Data Warehousing Basics Interview Questions & Answers - Learning Mode

A data warehouse is a repository of an organization's electronically stored data. Data warehouses are designed to facilitate reporting and analysis. Data warehousing is the process of constructing and using a data warehouse. A data warehouse is constructed by integrating data from multiple heterogeneous sources that support analytical reporting, structured and/or ad hoc queries, and decision making.

Question: What is the difference between a view and a materialized view?

Answer: View - stores the SQL statement in the database and lets you use it as a table. Every time you access the view, the SQL statement executes.

Materialized view - stores the results of the SQL statement in table form in the database. The statement executes when the materialized view is created or refreshed; after that, every query against it reads the stored result set. The main advantage is faster query results, at the cost of data that can be stale until the next refresh.
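
The difference can be sketched with Python's built-in sqlite3 module. SQLite has no materialized views, so this sketch stands in for one with a table created from the query result; the table and column names (sales, v_totals, mv_totals) are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (region TEXT, amount INTEGER);
    INSERT INTO sales VALUES ('East', 100), ('West', 50);

    -- View: only the query text is stored; it runs on every access.
    CREATE VIEW v_totals AS
        SELECT region, SUM(amount) AS total FROM sales GROUP BY region;

    -- Stand-in for a materialized view: the result set is stored as a table.
    CREATE TABLE mv_totals AS
        SELECT region, SUM(amount) AS total FROM sales GROUP BY region;
""")

# New data arrives after both objects were created.
conn.execute("INSERT INTO sales VALUES ('East', 25)")

view_rows = conn.execute("SELECT * FROM v_totals ORDER BY region").fetchall()
mv_rows = conn.execute("SELECT * FROM mv_totals ORDER BY region").fetchall()

print(view_rows)  # the view re-runs the query, so it sees the new row
print(mv_rows)    # the stored result is unchanged until it is rebuilt
```

In a database that supports materialized views natively, the rebuild step is a refresh of the materialized view rather than dropping and recreating a table.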

Question: What is BUS Schema?

Answer: A BUS schema is composed of a master suite of conformed dimensions and standardized definitions of facts.

Question: What is dimensional modelling?

Answer: Dimensional modelling is a design concept used by many data warehouse designers to build their data warehouses. In this design model, all the data is stored in two types of tables: fact tables and dimension tables. A fact table contains the facts/measurements of the business, and a dimension table contains the context of those measurements, i.e., the dimensions by which the facts are analyzed.
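
A minimal sketch of the two table types, using Python's built-in sqlite3 module. The tables and columns (dim_product, dim_date, fact_sales) are hypothetical and only illustrate the split between measurements and their context.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables: the context of each measurement.
    CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, product_name TEXT);
    CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, year INTEGER, month INTEGER);

    -- Fact table: dimension keys plus the numeric measurements.
    CREATE TABLE fact_sales (
        product_key INTEGER REFERENCES dim_product(product_key),
        date_key INTEGER REFERENCES dim_date(date_key),
        quantity INTEGER,
        amount REAL
    );

    INSERT INTO dim_product VALUES (1, 'Widget'), (2, 'Gadget');
    INSERT INTO dim_date VALUES (20240101, 2024, 1), (20240201, 2024, 2);
    INSERT INTO fact_sales VALUES
        (1, 20240101, 2, 20.0),
        (1, 20240201, 1, 10.0),
        (2, 20240201, 3, 45.0);
""")

# Analyze the fact (amount) by a dimension attribute (product_name).
sales_by_product = conn.execute("""
    SELECT p.product_name, SUM(f.amount)
    FROM fact_sales f
    JOIN dim_product p ON p.product_key = f.product_key
    GROUP BY p.product_name
    ORDER BY p.product_name
""").fetchall()

print(sales_by_product)
```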
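
Question: What are a fact, a dimension, and a measure?

Answer: A fact is a key performance indicator used to analyze the business. A measure is the numeric value recorded for a fact, such as a sales amount or quantity. A dimension provides the context used to analyze the fact; without dimensions, a fact has no meaning.

Question: What is a conformed fact?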

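Answer: A conformed fact is a measure that has the same definition, units, and calculation in every fact table or data mart where it appears, so its values can be compared or combined across marts. It is the counterpart of a conformed dimension, which is a dimension that can be used across multiple data marts in combination with multiple fact tables.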

Question: What is a hybrid slowly changing dimension?

Answer: Hybrid SCDs are a combination of SCD Type 1 and SCD Type 2.

In some tables, certain columns are important enough that we need to track their changes, i.e., capture their historical data, whereas for other columns we do not care about history even if the data changes.

For such tables we implement hybrid SCDs, in which some columns are Type 1 (overwritten in place) and some are Type 2 (a new row is added for each change).
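
The mixed handling can be sketched in plain Python. This is an illustrative sketch only: the column names (email, city), the row layout (surrogate_key, valid_from, valid_to, is_current), and the function apply_change are assumptions, not part of any particular ETL tool.

```python
from datetime import date

TYPE1_COLUMNS = {"email"}  # overwritten in place, no history kept
TYPE2_COLUMNS = {"city"}   # history kept by adding a new row version


def apply_change(rows, customer_id, changes, today):
    """Apply `changes` to the dimension rows of one customer."""
    versions = [r for r in rows if r["customer_id"] == customer_id]
    current = next(r for r in versions if r["is_current"])

    # Type 1: overwrite the value on every stored version of the customer.
    for col in TYPE1_COLUMNS.intersection(changes):
        for r in versions:
            r[col] = changes[col]

    # Type 2: expire the current row and append a new current row.
    changed = {c for c in TYPE2_COLUMNS.intersection(changes)
               if changes[c] != current[c]}
    if changed:
        new_row = dict(current)  # copy first, while it is still current
        new_row.update({c: changes[c] for c in changed})
        new_row["surrogate_key"] = max(r["surrogate_key"] for r in rows) + 1
        new_row["valid_from"] = today
        current["valid_to"] = today
        current["is_current"] = False
        rows.append(new_row)
    return rows


rows = [{"surrogate_key": 1, "customer_id": 42, "email": "a@example.com",
         "city": "Pune", "valid_from": date(2023, 1, 1), "valid_to": None,
         "is_current": True}]
apply_change(rows, 42, {"email": "b@example.com", "city": "Delhi"},
             date(2024, 6, 1))
for row in rows:
    print(row)
```

After the call, the customer has two rows: the old one keeps its city but carries the new email and is marked as expired, and the new current row holds the new city.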
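
Question: Why is the fact table in normal form?

Answer: A fact table consists of the foreign keys of the dimension/lookup tables and the measures.

Because every non-key column (each measure) depends on the whole combination of those keys, and the descriptive attributes live in the dimension tables rather than being repeated in the fact table, the fact table is already in normal form.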

Question: Can a dimension table contain numeric values?

Answer: Yes. Such values are descriptive attributes rather than measures, so they are often stored with a character data type even though the values themselves may be numeric.

Question: What is the data type of the surrogate key?

Answer: The data type of the surrogate key is an integer or other numeric type (for example INTEGER, NUMERIC, or NUMBER, depending on the database).
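
As a small illustration, the sketch below uses Python's built-in sqlite3 module to generate an integer surrogate key alongside a natural (business) key. The table and column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE dim_customer (
        customer_key INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key
        customer_code TEXT NOT NULL,                     -- natural/business key
        customer_name TEXT
    )
""")
conn.executemany(
    "INSERT INTO dim_customer (customer_code, customer_name) VALUES (?, ?)",
    [("C-1001", "Asha"), ("C-2040", "Ravi")],
)

# The database assigns the surrogate keys; the source system never sees them.
keys = [row[0] for row in conn.execute(
    "SELECT customer_key FROM dim_customer ORDER BY customer_key")]
print(keys)
```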

Question: What is a junk dimension? What is the difference between a junk dimension and a degenerate dimension?

Answer: Junk dimension: a grouping of miscellaneous flags and text attributes that are moved out of other tables into a separate dimension of their own.

Degenerate dimension: control information kept in the fact table itself. For example, consider a dimension table with fields such as order number and order line number that has a 1:1 relationship with the fact table. In this case the dimension table is removed and the order information is stored directly in the fact table, in order to eliminate unnecessary joins while retrieving order information.
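
Both ideas can be sketched together with Python's built-in sqlite3 module. The schema below (dim_order_flags, fact_order_line, and their columns) is invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Junk dimension: one row per combination of miscellaneous flags.
    CREATE TABLE dim_order_flags (
        flags_key INTEGER PRIMARY KEY,
        is_gift TEXT,
        payment_type TEXT
    );
    INSERT INTO dim_order_flags VALUES
        (1, 'N', 'CARD'), (2, 'Y', 'CARD'), (3, 'N', 'CASH'), (4, 'Y', 'CASH');

    -- order_number and order_line_number are degenerate dimensions:
    -- they live in the fact table and have no dimension table of their own.
    CREATE TABLE fact_order_line (
        order_number TEXT,
        order_line_number INTEGER,
        flags_key INTEGER REFERENCES dim_order_flags(flags_key),
        amount REAL
    );
    INSERT INTO fact_order_line VALUES
        ('SO-1', 1, 2, 30.0), ('SO-1', 2, 2, 12.5), ('SO-2', 1, 3, 8.0);
""")

# Order totals need no join, because the order number is in the fact table.
order_totals = conn.execute("""
    SELECT order_number, SUM(amount) FROM fact_order_line
    GROUP BY order_number ORDER BY order_number
""").fetchall()

# Flag analysis joins to the single junk dimension.
gift_total = conn.execute("""
    SELECT SUM(f.amount) FROM fact_order_line f
    JOIN dim_order_flags j ON j.flags_key = f.flags_key
    WHERE j.is_gift = 'Y'
""").fetchone()[0]

print(order_totals, gift_total)
```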

Question: What is a data warehousing hierarchy?

Answer: Hierarchies are logical structures that use ordered levels as a means of organizing data. A hierarchy can be used to define data aggregation. For example, in a time dimension, a hierarchy might aggregate data from the month level to the quarter level to the year level. A hierarchy can also be used to define a navigational drill path and to establish a family structure.

Within a hierarchy, each level is logically connected to the levels above and below it. Data values at low...
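
The month to quarter to year aggregation from the time-dimension example can be sketched in plain Python. The sales figures and the helper quarter_of are made up for illustration.

```python
from collections import defaultdict

# (year, month) -> sales amount at the lowest level of the hierarchy
monthly_sales = {
    (2024, 1): 100, (2024, 2): 120, (2024, 3): 80,
    (2024, 4): 90, (2024, 5): 110,
}


def quarter_of(month):
    """Map a month number (1-12) to its quarter (1-4)."""
    return (month - 1) // 3 + 1


# Roll up month -> quarter.
quarterly_sales = defaultdict(int)
for (year, month), amount in monthly_sales.items():
    quarterly_sales[(year, quarter_of(month))] += amount

# Roll up quarter -> year.
yearly_sales = defaultdict(int)
for (year, _quarter), amount in quarterly_sales.items():
    yearly_sales[year] += amount

print(dict(quarterly_sales))
print(dict(yearly_sales))
```

Drilling down is the reverse path: from a yearly total to its quarters, and from a quarter to its months.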

Question: What are the possible data marts in retail sales?

Answer: Product information and sales information.

Question: What are the data types present in BO (Business Objects), and what happens if we implement a view in the Designer and in a report?

Answer: There are three different data types: Dimension, Measure, and Detail.

A view is nothing but an alias, and it can be used to resolve loops in the universe.

Question: What are the advantages of data mining over traditional approaches?

Answer: Data mining is used to estimate the future. For example, for a company or business organization, data mining can be used to predict the future of the business in terms of revenue, employees, customers, orders, etc.

Traditional approaches use simple algorithms for estimating the future, and they give less accurate results than data mining.

Question: What are the differences between star and snowflake schemas?

Answer: Star schema

A single fact table with N dimension tables.

Snowflake schema

A schema in which one or more dimensions are extended into further dimension tables is known as a snowflake schema.
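
The two layouts can be compared side by side with Python's built-in sqlite3 module. This is a sketch with invented table names: the same product dimension is modelled once as a flat star dimension and once with the category extended into its own table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE fact_sales (product_key INTEGER, amount REAL);
    INSERT INTO fact_sales VALUES (1, 10.0), (2, 20.0), (3, 5.0);

    -- Star: the dimension is one flat table; category is repeated per product.
    CREATE TABLE dim_product_star (
        product_key INTEGER PRIMARY KEY, product_name TEXT, category_name TEXT);
    INSERT INTO dim_product_star VALUES
        (1, 'Pen', 'Stationery'), (2, 'Ink', 'Stationery'), (3, 'Mug', 'Kitchen');

    -- Snowflake: category moves to its own table, referenced by key.
    CREATE TABLE dim_category (category_key INTEGER PRIMARY KEY, category_name TEXT);
    CREATE TABLE dim_product_snow (
        product_key INTEGER PRIMARY KEY, product_name TEXT,
        category_key INTEGER REFERENCES dim_category(category_key));
    INSERT INTO dim_category VALUES (1, 'Stationery'), (2, 'Kitchen');
    INSERT INTO dim_product_snow VALUES (1, 'Pen', 1), (2, 'Ink', 1), (3, 'Mug', 2);
""")

# Star: one join from the fact table reaches the category.
star = conn.execute("""
    SELECT p.category_name, SUM(f.amount) FROM fact_sales f
    JOIN dim_product_star p ON p.product_key = f.product_key
    GROUP BY p.category_name ORDER BY p.category_name
""").fetchall()

# Snowflake: the same question needs a second join through dim_category.
snowflake = conn.execute("""
    SELECT c.category_name, SUM(f.amount) FROM fact_sales f
    JOIN dim_product_snow p ON p.product_key = f.product_key
    JOIN dim_category c ON c.category_key = p.category_key
    GROUP BY c.category_name ORDER BY c.category_name
""").fetchall()

print(star)
print(snowflake)
```

Both queries return the same totals; the snowflake form stores each category name once but pays for it with the extra join.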

Question: What are the data validation strategies for data mart validation after the loading process?

Answer: Data validation makes sure that the loaded data is accurate and meets the business requirements.

Strategies are the different methods followed to meet the validation requirements.
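
The answer does not name specific strategies. As one illustrative possibility, the sketch below reconciles the row count and the total of a measure between a staging table and the loaded mart table, using Python's built-in sqlite3 module. The tables stg_sales and mart_sales and the function reconcile are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stg_sales (order_id INTEGER, amount REAL);
    CREATE TABLE mart_sales (order_id INTEGER, amount REAL);
    INSERT INTO stg_sales VALUES (1, 10.0), (2, 20.0), (3, 30.0);
    INSERT INTO mart_sales VALUES (1, 10.0), (2, 20.0);  -- one row missing
""")


def reconcile(conn, source, target):
    """Compare row count and total amount between two tables."""
    # Table names are interpolated; pass only trusted, hard-coded names.
    query = "SELECT COUNT(*), COALESCE(SUM(amount), 0) FROM {}"
    src = conn.execute(query.format(source)).fetchone()
    tgt = conn.execute(query.format(target)).fetchone()
    return {"rows_match": src[0] == tgt[0], "amount_match": src[1] == tgt[1]}


result = reconcile(conn, "stg_sales", "mart_sales")
print(result)  # both checks fail here because the load dropped a row
```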

Question: Can anybody explain clearly how to explain a (sales) project in an interview? Where does a report developer's work actually start?

Answer: If you are a report developer:
1. Specify the front-end and back-end tools used for creating the reports.
2. Explain the purpose of the project, i.e., what you are going to achieve using the reports.
3. Explain the back-end part, which is important. For example, state which facts and dimensions are going to be used.
4. Once the facts and dimensions are identified, you might want to restructure them using views. Also have to...

Question: What are the various ETL tools in the market?

Answer: Various ETL tools used in the market are:

DataStage
Oracle Warehouse Builder
Ab Initio
Data Junction

Question: What is the difference between E-R modeling and dimensional modeling?

Answer: The basic difference is that E-R modeling has both a logical and a physical model, whereas dimensional modeling has only a physical model.

E-R modeling is used for normalizing the OLTP database design.

Dimensional modeling is used for de-normalizing the ROLAP/MOLAP design.

