FINALTERM EXAMINATION
Spring 2009
CS614 - Data Warehousing
Marks: 70
Question No: 1 ( Marks: 1 ) http://vuzs.net
It is observed that every year the amount of data recorded in an organization is
Doubles (handouts page # 6)
Triples
Quadruples
Remains same as previous year
Question No: 2 ( Marks: 1 )
Multidimensional databases typically use proprietary __________ format to store
pre-summarized cube structures.
File ( Page # 69 )
Application
Aggregate
Database
Question No: 3 ( Marks: 1 )
Pre-computed _______ can solve performance problems.
Aggregates (page # 101)
Facts
Dimensions
Question No: 4 ( Marks: 1 )
_______________, if it fits into memory, costs only one disk I/O access to locate a
record by given key.
A Dense Index (page # 211)
A Sparse Index
An Inverted Index
None of These
Question No: 5 ( Marks: 1 )
The degree of similarity between two records, often measured by a numerical
value between _______, usually depends on application characteristics.
0 and 1 (page # 157 )
0 and 10
0 and 100
0 and 99
Question No: 6 ( Marks: 1 )
The purpose of the House of Quality technique is to reduce ______ types of risk.
Two (page # 181)
Three
Four
All
Question No: 7 ( Marks: 1 )
NUMA stands for __________
Non-uniform Memory Access ( page # 194)
Non-updateable Memory Architecture
New Universal Memory Architecture
Question No: 8 ( Marks: 1 )
Which is the least appropriate join operation for Pipeline parallelism?
Hash Join
Inner Join
Outer Join
Sort-Merge Join
Question No: 9 ( Marks: 1 )
There are many variants of the traditional nested-loop join. If the index is built as
part of the query plan and subsequently dropped, it is called
Naive nested-loop join
Index nested-loop join
Temporary index nested-loop join ( page # 230)
None of these
Question No: 10 ( Marks: 1 )
Data mining derives its name from the similarities between searching for valuable
business information in a large database, for example, finding linked products in
gigabytes of store scanner data, and mining a mountain for a _________ of
valuable ore.
Furrow
Streak
Trough
Vein
Question No: 11 ( Marks: 1 )
With data mining, the best way to accomplish this is by setting aside some of
your data in a ________ to isolate it from the mining process; once the mining is
complete, the results can be tested against the isolated data to confirm the
model's validity.
Cell
Disk
Folder
Vault
Question No: 12 ( Marks: 1 )
Kimball's iterative data warehouse development approach drew on decades
of experience to develop the _____________.
Business Dimensional Lifecycle (page # 276 )
Data Warehouse Dimension
Business Definition Lifecycle
OLAP Dimension
Question No: 13 ( Marks: 1 )
We must try to find the one access tool that will handle all the needs of the
users.
True
False
Question No: 14 ( Marks: 1 )
For a smooth DWH implementation we must be a technologist.
True
False (page # 306)
Question No: 15 ( Marks: 1 )
During the application specification activity, we also must give consideration to
the organization of the applications.
True ( page # 294 )
False
Question No: 16 ( Marks: 1 )
Investing years in architecture and forgetting the primary purpose of solving
business problems results in an inefficient application. This is an example of the
_________ mistake.
Extreme Technology Design
Extreme Architecture Design
None of these (page # 303)
Question No: 17 ( Marks: 1 )
The most recent attack is the ________ attack on the cotton crop during 2003-04,
resulting in a loss of nearly 0.5 million bales.
Boll Worm (VIDEO LECTURE # 38)
Purple Worm
Blue Worm
Cotton Worm
Question No: 18 ( Marks: 1 )
The users of a data warehouse are knowledge workers; in other words, they are
_________ in the organization.
Decision maker (page # 10)
Manager
Database Administrator
DWH Analyst
Question No: 19 ( Marks: 1 )
_________ breaks a table into multiple tables based upon common column
values.
Horizontal splitting (page # 46 )
Vertical splitting
Question No: 20 ( Marks: 1 )
Execution can be completed successfully or it may be stopped due to some
error. In case of successful completion of execution all the transactions will be
___________
Committed to the database (page # 398 last line)
Rolled back
Question No: 21 ( Marks: 2 )
What is meant by the statement "Be a diplomat NOT a technologist" in the
context of a data warehouse development project?
Be a diplomat NOT a technologist
The biggest problem you will face during a warehouse implementation will be people, not the technology or the development. You’re going to have senior management complaining about completion dates and unclear objectives. You’re going to have development people protesting that everything takes too long and why can’t they do it the old way? You’re going to have users with outrageously unrealistic expectations, who are used to systems that require mouse-clicking but not much intellectual investment on their part. And you’re going to grow exhausted, separating out Needs from Wants at all levels. Commit from the outset to work very hard at communicating the realities, encouraging investment, and cultivating the development of new skills in your team and your users (and even your bosses).
Question No: 22 ( Marks: 2 )
Elaborate the concept of data parallelism.
Parallel execution of a single data manipulation task across multiple partitions of data.
Partitions may be static or dynamic.
Tasks are executed almost independently across partitions.
A "query coordinator" must coordinate between the independently executing processes.
Data parallelism is arguably the simplest form of parallelization: a single data operation is executed in parallel across multiple partitions of data. The partitions may be defined statically or dynamically, but the same operator is applied across all of them concurrently. This idea has existed for a very long time.
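A minimal Python sketch of this idea follows. It is an illustration only, not taken from the handouts: the names `partition`, `partial_sum`, and `coordinator`, the round-robin static partitioning, and the use of a thread pool are all assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(rows, n_parts):
    """Statically split rows into n_parts partitions (round-robin, an assumption)."""
    return [rows[i::n_parts] for i in range(n_parts)]


def partial_sum(part):
    """The single operator applied to every partition: a local SUM."""
    return sum(part)


def coordinator(rows, n_parts=4):
    """Plays the role of the query coordinator: fan out, then combine."""
    parts = partition(rows, n_parts)
    with ThreadPoolExecutor(max_workers=n_parts) as pool:
        # The same operator runs on each partition almost independently.
        partials = list(pool.map(partial_sum, parts))
    # The coordinator merges the partial results into the final answer.
    return sum(partials)


sales = list(range(1, 101))  # illustrative data only
total = coordinator(sales)
```

In CPython the threads mainly show the structure (same operator, many partitions, one coordinator); a real parallel database would run the partitions on separate processes or nodes.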
Question No: 23 ( Marks: 2 )
What will be the effect if we program a package by using DTS object model?
Question No: 24 ( Marks: 3 )
What is meant by the classification process? How do we measure the accuracy of
classifiers?
Classification means that, based on the properties of existing data, we form groups (classes), i.e., we classify the data.
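One common way to measure a classifier's accuracy is the fraction of held-out records it labels correctly. The sketch below is only an illustration under assumptions: the threshold rule in `classify`, the class names, and the test records are all made up for the example and do not come from the handouts.

```python
def classify(income, threshold=50_000):
    """Toy classifier (hypothetical rule): label a record by an income threshold."""
    return "high" if income >= threshold else "low"


def accuracy(records):
    """Fraction of records whose predicted class equals the true class."""
    correct = sum(1 for income, label in records if classify(income) == label)
    return correct / len(records)


# Held-out test set of (income, true class) pairs; values are illustrative only.
test_set = [
    (20_000, "low"),
    (75_000, "high"),
    (55_000, "low"),   # the toy rule gets this one wrong
    (90_000, "high"),
]
acc = accuracy(test_set)
```

Here three of the four test records are classified correctly, so `acc` is 0.75.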
Question No: 25 ( Marks: 3 )
How does the page dimension capture the static and dynamic nature of different web
pages?
Question No: 26 ( Marks: 3 )
Write down the limitations of pipeline parallelism.
Pipeline parallelism is a good fit for data warehousing (where we are working with lots of data), but it makes no sense for OLTP because OLTP tasks are not big enough to justify breaking them down into subtasks.
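To make the idea of breaking a task into subtasks concrete, here is a minimal Python sketch of a three-stage pipeline. It is an assumption-laden illustration, not the handouts' method: the stage names `scan`, `keep_even`, and `total`, the `DONE` sentinel, and the use of threads with queues are all chosen for the example.

```python
import queue
import threading

DONE = object()  # sentinel marking the end of the stream (an assumption)


def scan(rows, out_q):
    """Stage 1: emit rows one by one to the next stage."""
    for row in rows:
        out_q.put(row)
    out_q.put(DONE)


def keep_even(in_q, out_q):
    """Stage 2: filter rows as they arrive, without waiting for stage 1 to finish."""
    while (row := in_q.get()) is not DONE:
        if row % 2 == 0:
            out_q.put(row)
    out_q.put(DONE)


def total(in_q, result):
    """Stage 3: aggregate the filtered rows."""
    acc = 0
    while (row := in_q.get()) is not DONE:
        acc += row
    result.append(acc)


def run_pipeline(rows):
    q1, q2, result = queue.Queue(), queue.Queue(), []
    stages = [
        threading.Thread(target=scan, args=(rows, q1)),
        threading.Thread(target=keep_even, args=(q1, q2)),
        threading.Thread(target=total, args=(q2, result)),
    ]
    for t in stages:
        t.start()
    for t in stages:
        t.join()
    return result[0]


answer = run_pipeline(range(1, 11))
```

For an input this small, the cost of setting up threads and queues outweighs any gain, which mirrors the point that OLTP-sized tasks do not justify being split into subtasks.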
Question No: 27 ( Marks: 5 )
For maximum performance of a bitmapped index, what characteristics should a query
have?
Question No: 28 ( Marks: 5 )
How do the three parallel tracks capture the user requirements in Kimball's data
warehouse life cycle Road Map?
Question No: 29 ( Marks: 5 )
How are time-contiguous log entries and the HTTP Secure Sockets Layer used for user
session identification? What are the limitations of these techniques?
Question No: 30 ( Marks: 10 )
What are the issues regarding the record management tools at campuses where
text files are used to store data?
Main issues:
Data duplication
Updating the data
Data deletion
Each of these issues can be elaborated in the answer.
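The three issues listed above can be illustrated with a small Python sketch. Everything in it is hypothetical: the record layout (id, name, campus), the sample rows, and the helper names `load`, `duplicates`, `update_campus`, and `delete_student` are assumptions made for the example.

```python
import csv
import io

# Hypothetical student records kept as plain text, one line per record.
# Nothing in a text file prevents the same record from being entered twice.
TEXT_FILE = """101,Ali,Lahore
102,Sara,Karachi
101,Ali,Lahore
"""


def load(text):
    """Parse the text file into a list of (id, name, campus) tuples."""
    return [tuple(row) for row in csv.reader(io.StringIO(text))]


def duplicates(rows):
    """Data duplication: return rows that appear more than once."""
    seen, dups = set(), []
    for row in rows:
        if row in seen:
            dups.append(row)
        seen.add(row)
    return dups


def update_campus(rows, student_id, campus):
    """Update: every copy of the record must be found and rewritten."""
    return [(sid, name, campus if sid == student_id else old)
            for sid, name, old in rows]


def delete_student(rows, student_id):
    """Deletion: every copy must be removed, or stale data survives."""
    return [row for row in rows if row[0] != student_id]


rows = load(TEXT_FILE)
dups = duplicates(rows)
moved = update_campus(rows, "101", "Islamabad")
remaining = delete_student(rows, "101")
```

Missing even one copy during an update or deletion leaves the file inconsistent, which is the kind of problem a database with key constraints avoids.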
Question No: 31 ( Marks: 10 )
Shared RDBMS architecture requires a static partitioning. How do you perform the
partitioning?