Ab Initio Interview Questions & Answers

Ab Initio is a fourth-generation, GUI-based parallel processing tool for data analysis, batch processing, and data manipulation. It is commonly used to extract, transform, and load (ETL) data.

Question: How many types of parallelism are there in Ab Initio? Please define each.


Answer: There are three types of parallelism in Ab Initio.

1) Data parallelism: The data is divided into partitions, and the partitions are processed at the same time, possibly on different servers.

2) Pipeline parallelism: Records are processed in a pipeline, i.e. a component does not have to wait for the upstream component to finish all of its records. Each record is passed to the next component in the pipeline as soon as it has been processed.

3) Component parallelism: Two or more components process records at the same time on different branches of the graph.
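The three kinds of parallelism can be imitated in a few lines of Python. This is only a rough sketch, not Ab Initio code: the record layout (`id`, `amount`), the function names, and the use of threads and generators are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def partition_by_key(records, n_partitions, key):
    """Split records into n partitions using an integer key (data parallelism)."""
    parts = [[] for _ in range(n_partitions)]
    for rec in records:
        parts[rec[key] % n_partitions].append(rec)
    return parts


def double_amount(records):
    """Pipeline stage 1: yield each record as soon as it has been processed."""
    for rec in records:
        yield dict(rec, amount=rec["amount"] * 2)


def keep_large(records, threshold):
    """Pipeline stage 2: consume stage 1 record by record, not all at once."""
    for rec in records:
        if rec["amount"] >= threshold:
            yield rec


def run_pipeline(records):
    """Chain the two stages so records stream through them one at a time."""
    return list(keep_large(double_amount(records), threshold=50))


def data_parallel(records, n_partitions=2):
    """Run the same pipeline on every partition at the same time."""
    parts = partition_by_key(records, n_partitions, key="id")
    with ThreadPoolExecutor(max_workers=n_partitions) as pool:
        return list(pool.map(run_pipeline, parts))


def component_parallel(customers, orders):
    """Run two unrelated computations on two different inputs concurrently."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        n_customers = pool.submit(len, customers)
        total = pool.submit(sum, [o["amount"] for o in orders])
        return n_customers.result(), total.result()
```

Python generators only model the record-at-a-time hand-off of a pipeline; they do not run the stages on separate processors the way a real parallel engine would.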

Question: How do you connect the EME to the Ab Initio server?

Question: Why might you create a stored procedure with the 'with recompile' option?

Answer: The recompile option is useful when the tables referenced by the stored procedure undergo frequent inserts, updates, and deletes. Under heavy modification activity the cached execution plan becomes outdated, and the stored procedure's performance degrades. If the stored procedure is created with the recompile option, SQL Server does not cache a plan for it and recompiles it every time it runs.

Question: What does dependency analysis mean in Ab Initio?

Answer: Dependency analysis answers questions about data lineage: where the data comes from, which applications produce it, which applications depend on it, and so on.

Question: How do you create a surrogate key using Ab Initio?

Answer: There are many ways to create a surrogate key; the right one depends on your business logic. You can try these:

1. Use the next_in_sequence() function in your transform.

2. Use the Assign Key Values component (if your GDE version is higher than 1.10).

3. Write a stored procedure that generates the key and call it wherever you need it.
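When a per-partition counter such as next_in_sequence() is used in a graph that runs in parallel, the keys from different partitions have to be kept from colliding. One common approach is to interleave the counters. The Python sketch below illustrates that idea only; the function names and the `sk` field are hypothetical and are not part of Ab Initio.

```python
from itertools import count


def surrogate_keys(partition, n_partitions, start=1):
    """Yield start+partition, start+partition+n_partitions, and so on.

    Each partition owns its own arithmetic progression, so two
    partitions can never produce the same key.
    """
    if not 0 <= partition < n_partitions:
        raise ValueError("partition must be in range(n_partitions)")
    for seq in count():  # 0, 1, 2, ... the per-partition sequence number
        yield start + partition + seq * n_partitions


def assign_keys(records, partition, n_partitions):
    """Attach a surrogate key (field 'sk') to each record of one partition."""
    keys = surrogate_keys(partition, n_partitions)
    return [dict(rec, sk=next(keys)) for rec in records]
```

With two partitions, partition 0 receives the keys 1, 3, 5, ... and partition 1 receives 2, 4, 6, ..., so the combined output has unique keys without any coordination between partitions.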

Question: What is the difference between a sandbox and the EME? Can we perform check-in and check-out through a sandbox? Can anybody explain check-in and check-out?

Answer: Sandboxes are work areas used to develop, test, or run code associated with a given project. A sandbox can hold only one version of the code at any time.

The EME datastore contains all versions of the code that have been checked in to it. A particular sandbox is associated with only one project, whereas a project can be checked out to a number of sandboxes.

Question: What is m_dump?

Answer: The m_dump command prints the records of a data file in a formatted, readable way, using the record format given in the DML file.

m_dump <dml> <file.dat>

Question: What is the difference between a file and a table in Ab Initio?

Answer: A table holds relational data, i.e. it is a relational structure.
A file is a non-relational structure that simply holds data.

Question: What is the driving port? When do you use it?

Answer: The driving parameter of the JOIN component becomes available when you set its sorted-input parameter to "In memory: Input need not be sorted".
The driving port is generally used to improve the performance of a graph.

The driving input is the largest input. All other inputs are read into memory.

For example, suppose the largest input to be joined is on the in1 port. Specify a port number of 1 as the value of the driving parameter. The component reads all other inputs to the join (for example, in0) into memory.
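The reason for choosing the largest input as the driving input is easier to see in code. The following Python sketch is an assumption-laden illustration of an in-memory hash join, not the JOIN component's actual implementation: every non-driving input is loaded into a lookup table, and the driving input is streamed past those tables once.

```python
from collections import defaultdict


def in_memory_join(inputs, driving, key):
    """Inner-join several inputs on `key`, streaming only the driving one.

    inputs  -- list of iterables of dict records (in0, in1, ...)
    driving -- index of the input that is streamed instead of loaded
    """
    # Load every non-driving input into a hash table keyed on the join key.
    tables = []
    for idx, records in enumerate(inputs):
        if idx == driving:
            continue
        table = defaultdict(list)
        for rec in records:
            table[rec[key]].append(rec)
        tables.append(table)

    # Stream the driving input once, probing the in-memory tables.
    for rec in inputs[driving]:
        matches = [rec]
        for table in tables:
            matches = [
                dict(left, **right)
                for left in matches
                for right in table.get(rec[key], [])
            ]
        yield from matches
```

Memory use grows with the size of the non-driving inputs only, which is why the largest input is the one left to stream.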

Question: What are the continuous components in Ab Initio?

Answer: Continuous components are used to create graphs that run continuously and produce usable output while they are running.

Examples: Continuous Rollup, Continuous Update, Batch Subscribe.

Question: What are primary keys and foreign keys?

Answer: In an RDBMS, the relationship between two tables is represented as a primary key and foreign key relationship. The table that holds the primary key is the parent table, and the table that holds the foreign key is the child table. The requirement is that both tables share a matching column.
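The parent/child relationship can be tried out with Python's built-in sqlite3 module. This is a minimal sketch; the table and column names (`customer`, `orders`, `customer_id`) are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Parent table: customer_id is the primary key.
conn.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT)")
# Child table: customer_id is a foreign key that must match a parent row.
conn.execute(
    "CREATE TABLE orders ("
    " order_id INTEGER PRIMARY KEY,"
    " customer_id INTEGER NOT NULL REFERENCES customer(customer_id))"
)
conn.execute("INSERT INTO customer VALUES (1, 'Asha')")


def insert_order(order_id, customer_id):
    """Return True if the order was inserted, False if a key constraint failed."""
    try:
        conn.execute("INSERT INTO orders VALUES (?, ?)", (order_id, customer_id))
        return True
    except sqlite3.IntegrityError:
        return False
```

An order for customer 1 is accepted because the parent row exists, while an order for a customer_id that is not in the customer table is rejected by the foreign key constraint.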

Question: What are data mapping and data modelling?

Answer: Data mapping deals with the transformation of the extracted data at the field level, i.e. the transformation of a source field to a target field is specified by the mapping defined on the target field. The data mapping is specified during the cleansing of the data to be loaded.

For example:

string(35) name = "Siva Krishna ";

string("01") nm = NULL(""); /* maximum length is string(35) */

Then we can have a mapping like:

Straight move.

Question: How do you create a repository in Ab Initio for a stand-alone system (local NT)?

Answer: If you are installing Ab Initio on a stand-alone machine, you do not need to create the repository yourself. The installer creates it automatically under the abinitio folder (the folder where you install Ab Initio).

Question: How do you work with parameterized graphs?

Answer: One of the main purposes of parameterized graphs is to run the same graph any number of times for different files. We set up graph parameters such as $INPUT_FILE and $OUTPUT_FILE and supply their values under Edit > Parameters. These parameters are substituted at run time. We can set different types of parameters, such as positional, keyword, and local.

The idea is that, instead of maintaining different versions of the same graph, we can maintain one version of the graph and supply different parameter values for each run.
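Run-time substitution of $-style parameters can be mimicked with Python's string.Template, which happens to use the same $NAME syntax. This is a sketch of the idea only: the `run_graph` command line is hypothetical, and only the parameter names $INPUT_FILE and $OUTPUT_FILE come from the answer above.

```python
from string import Template

# Hypothetical command line standing in for a single shared graph definition.
GRAPH_TEMPLATE = Template("run_graph --in $INPUT_FILE --out $OUTPUT_FILE")


def resolve(input_file, output_file):
    """Substitute the two parameters into the single shared definition."""
    return GRAPH_TEMPLATE.substitute(INPUT_FILE=input_file, OUTPUT_FILE=output_file)


def resolve_all(input_files):
    """Produce one resolved run per input file from the same template."""
    return [resolve(name, name + ".out") for name in input_files]
```

One definition serves any number of files; only the supplied values change from run to run.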

Question: How does MAXCORE work?

Answer: max-core is a value, specified in bytes, that sets the maximum amount of memory a component may use. Whenever the component is executed, it can use up to the amount of memory we specified.

Question: What is the AB_LOCAL expression, and where do you use it in Ab Initio?

Answer: ablocal_expr is a parameter of the itable (Input Table) component of Ab Initio. ABLOCAL() is replaced by the contents of ablocal_expr, which we can make use of in parallel unloads. There are two forms of the ABLOCAL() construct: one with no arguments and one with a single argument, a table name (the driving table).

The ABLOCAL() construct is useful because some complex SQL statements contain grammar that is not recognized by the Ab Initio parser when unloading in parallel. You can use the ABLOCAL() construct in this case.

Question: Explain the difference between the "truncate" and "delete" commands.

Answer: Truncate: It is a DDL command, used to remove all rows from a table or cluster. Because it is a DDL command, it is committed automatically and cannot be rolled back (this is the behaviour in databases such as Oracle). It is faster than delete.

Delete: It is a DML command, generally used to delete one or more rows from a table or cluster. It can be rolled back to restore the deleted rows. To make the deletion permanent, the "commit" command should be used.
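The transactional behaviour of delete can be observed with Python's built-in sqlite3 module. This is a small sketch under stated assumptions: the table `t` is invented, and SQLite has no TRUNCATE statement, so only the delete, rollback, and commit half of the comparison is shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(1,), (2,), (3,)])
conn.commit()


def row_count():
    """Return the number of rows currently visible in table t."""
    return conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]


# DELETE is DML: it runs inside a transaction and can be undone.
conn.execute("DELETE FROM t WHERE id > 1")
after_delete = row_count()

conn.rollback()  # the deleted rows come back
after_rollback = row_count()

# Once the delete is committed, it is permanent.
conn.execute("DELETE FROM t")
conn.commit()
after_commit = row_count()
```

The three counts show the table shrinking after the delete, returning to its full size after the rollback, and staying empty once the second delete has been committed.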

Question: What are the most commonly used components in an Ab Initio graph?

Can anybody give me a practical example of a transformation of data, say customer data in a credit card company, into meaningful output based on business rules?

Answer: The most commonly used components in any Ab Initio project are:

Input File / Output File

Input Table / Output Table

Lookup File

Reformat, Gather, Join, Run SQL, Join with DB, the compress components, Sort, Trash, Partition by Expression, Partition by Key, Concatenate

Question: How do you test a dbc file from the command prompt?

Answer: You can test a dbc file from the Unix command prompt with m_db test, which checks the database connection, database version, user name, password, database objects, and other mandatory values.

The syntax is:

m_db test <name-of-the dbc file>

For example:

m_db test my_db_connection_test.dbc

Here the my_db_connection_test.dbc file should be in the current directory.

Question: What is .abinitiorc and what does it contain?

Answer: .abinitiorc is the configuration file for Ab Initio. It is found in the user's home directory. It generally contains the Ab Initio home path and the login information (such as user ID, encrypted password, and login method) for the hosts that a graph connects to at execution time.

It may also contain other information, such as the EME host.

