What is Data
Data is information that has been translated into a form that is more convenient to move or process.
What is a Database
A database is an organized collection of related data.
What is a Warehouse
A warehouse is a place where data is stored.
What is a Data Mart
A data mart contains a subset of organization-wide data. This subset is valuable to specific groups within an organization. In other words, data marts contain data specific to a particular group. For example, a marketing data mart may contain data related to items, customers, and sales. Data marts are confined to subjects.
What is a Data Warehouse
A data warehouse is a centralized repository that stores data from multiple information sources and transforms it into a common, multidimensional data model for efficient querying and analysis.
The most popular definition came from Bill Inmon, who provided the following:
"A data warehouse is a subject-oriented, integrated, time-variant and non-volatile collection of data in support of management's decision making process."
Subject-Oriented:
A data
warehouse can be used to analyze a particular subject area. For example,
“sales” can be a particular subject.
Integrated:
A data
warehouse integrates data from multiple data sources. For example, source A and
source B may have different ways of identifying a product, but in a data
warehouse, there will be only a single way of identifying a product.
Time-Variant:
Historical data is kept in a data warehouse. For example, one can retrieve data from 3 months, 6 months, 12 months, or even older from a data warehouse. This contrasts with a transaction system, where often only the most recent data is kept. For example, a transaction system may hold only the most recent address of a customer, whereas a data warehouse can hold all addresses associated with a customer.
Non-volatile:
Once data
is in the data warehouse, it will not change. So, historical data in a data
warehouse should never be altered.
OLAP vs OLTP
| Sr.No. | Data Warehouse (OLAP) | Operational Database (OLTP) |
|--------|-----------------------|------------------------------|
| 1 | Involves historical processing of information. | Involves day-to-day processing. |
| 2 | OLAP systems are used by knowledge workers such as executives, managers, and analysts. | OLTP systems are used by clerks, DBAs, or database professionals. |
| 3 | Useful in analyzing the business. | Useful in running the business. |
| 4 | Focuses on information out. | Focuses on data in. |
| 5 | Based on Star Schema, Snowflake Schema, and Fact Constellation Schema. | Based on the Entity-Relationship Model. |
| 6 | Contains historical data. | Contains current data. |
| 7 | Provides summarized and consolidated data. | Provides primitive and highly detailed data. |
| 8 | Provides a summarized and multidimensional view of data. | Provides a detailed and flat relational view of data. |
| 9 | Number of users is in hundreds. | Number of users is in thousands. |
| 10 | Number of records accessed is in millions. | Number of records accessed is in tens. |
| 11 | Database size is from 100 GB to 1 TB. | Database size is from 100 MB to 1 GB. |
| 12 | Highly flexible. | Provides high performance. |
What is a Fact Table
A fact table is a table that contains summarized numerical and historical data (facts) and a multipart index composed of foreign keys from the primary keys of related dimension tables.
Dimensions
In data warehousing, a dimension is a collection of reference information about a measurable event. These events are known as facts and are stored in a fact table.
Dimensions categorize and
describe data warehouse facts and measures in ways that support meaningful
answers to business questions. They form the very core of dimensional modeling.
What is Dimension tables
Dimension tables are referenced by fact tables using keys. When creating a dimension table in a data warehouse, a system-generated key is used to uniquely identify a row in the dimension. This key is also known as a surrogate key. The surrogate key is used as the primary key in the dimension table. The surrogate key is placed in the fact table and a foreign key is defined between the two tables. When the data is joined, it does so just as any other join within the database.
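As a minimal sketch of this arrangement (all table and column names here are hypothetical, not taken from a specific warehouse), a dimension table keyed by a surrogate key and a fact table referencing it might look like:

```sql
-- A minimal star-schema sketch: the dimension's surrogate key is its
-- primary key and is carried into the fact table as a foreign key.
CREATE TABLE dim_product (
    product_key  INTEGER PRIMARY KEY,   -- system-generated surrogate key
    product_id   VARCHAR(20),           -- natural key from the source system
    product_name VARCHAR(100),
    category     VARCHAR(50)
);

CREATE TABLE fact_sales (
    date_key     INTEGER NOT NULL,
    product_key  INTEGER NOT NULL REFERENCES dim_product (product_key),
    sales_amount DECIMAL(12,2)          -- the numeric fact
);
```

Joining fact_sales to dim_product on product_key then works like any other join within the database.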
Characteristics of Data Warehouse:
i. Subject-oriented:
The warehouse organizes data around the essential subjects of the business (customers and products) rather than around applications such as inventory management or order processing.
ii. Integrated:
It is consistent in the way that data from several sources is extracted and transformed. For example, coding conventions are standardized: M = male, F = female.
iii. Time-variant:
Data are organized by various time periods (e.g. months).
iv. Non-volatile:
The warehouse's database is not updated in real time. There is periodic bulk uploading of transactional and other data.
The Advantages of Data Warehousing
- Clean data: required info
- Indexes: multiple types
- Query processing: multiple options, faster
- Security: data and access (multiple views)
- Easy report creation
- Enhanced access to data and information
The Disadvantages of Data Warehousing
- Preparation may be time consuming
- Compatibility with existing systems
- Security issues
Data Warehouse Architecture
Types of Dimensions
A dimension table typically has two types of columns: primary keys to fact tables and textual/descriptive data.
Eg: Time, Customer
1. Slowly Changing Dimensions
2. Rapidly Changing Dimensions
3. Junk Dimensions
4. Inferred Dimensions
5. Conformed Dimensions
6. Degenerate Dimensions
7. Role Playing Dimensions
8. Shrunken Dimensions
9. Static Dimensions
Slowly Changing Dimensions
Attributes of a dimension may undergo changes over time. Whether the history of changes to a particular attribute should be preserved in the data warehouse depends on the business requirement. Such an attribute is called a slowly changing attribute, and a dimension containing such an attribute is called a slowly changing dimension.
Rapidly Changing Dimensions
A dimension attribute that changes frequently is a rapidly
changing attribute. If you don’t need to track the changes, the rapidly
changing attribute is no problem, but if you do need to track the changes,
using a standard slowly changing dimension technique can result in a huge
inflation of the size of the dimension. One solution is to move the attribute
to its own dimension, with a separate foreign key in the fact table. This new
dimension is called a rapidly changing dimension.
Junk Dimensions
A junk dimension is a single table with a combination of different
and unrelated attributes to avoid having a large number of foreign keys in the
fact table. Junk dimensions are often created to manage the foreign keys
created by rapidly changing dimensions.
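As a brief illustration (hypothetical table and column names), a junk dimension might collect several unrelated low-cardinality flags into one table:

```sql
-- A junk dimension bundling unrelated flags so the fact table carries
-- one junk_key instead of several tiny dimension foreign keys.
CREATE TABLE dim_order_junk (
    junk_key      INTEGER PRIMARY KEY,  -- surrogate key
    is_gift_wrap  CHAR(1),              -- 'Y' / 'N'
    payment_type  VARCHAR(20),          -- 'CASH', 'CARD', ...
    shipping_mode VARCHAR(20)           -- 'GROUND', 'AIR', ...
);
```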
Inferred Dimensions
While loading fact records, a dimension record may not yet be
ready. One solution is to generate a surrogate key with null for all the other
attributes. This should technically be called an inferred member, but is often
called an inferred dimension.
Conformed Dimensions
A dimension that is used in multiple locations is called a
conformed dimension. A conformed dimension may be used with multiple fact
tables in a single database, or across multiple data marts or data warehouses.
Degenerate Dimensions
A degenerate dimension is when the dimension attribute is stored
as part of fact table, and not in a separate dimension table. These are
essentially dimension keys for which there are no other attributes. In a data
warehouse, these are often used as the result of a drill through query to analyze
the source of an aggregated number in a report. You can use these values to
trace back to transactions in the OLTP system.
Role Playing Dimensions
A role-playing dimension is one where the same dimension key —
along with its associated attributes — can be joined to more than one foreign
key in the fact table. For example, a fact table may include foreign keys for
both ship date and delivery date. But the same date dimension attributes apply
to each foreign key, so you can join the same dimension table to both foreign
keys. Here the date dimension takes on multiple roles, mapping ship date as well as delivery date, hence the name role-playing dimension.
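A short sketch of such a query (hypothetical table and column names), joining the same date dimension once per role:

```sql
-- One date dimension playing two roles: ship date and delivery date.
SELECT f.order_id,
       ship_dt.calendar_date     AS ship_date,
       delivery_dt.calendar_date AS delivery_date
FROM   fact_orders f
JOIN   dim_date ship_dt     ON ship_dt.date_key     = f.ship_date_key
JOIN   dim_date delivery_dt ON delivery_dt.date_key = f.delivery_date_key;
```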
Shrunken Dimensions
A shrunken dimension is a subset of another dimension. For
example, the orders fact table may include a foreign key for product, but the
target fact table may include a foreign key only for productcategory, which is
in the product table, but much less granular. Creating a smaller dimension
table, with productcategory as its primary key, is one way of dealing with this
situation of heterogeneous grain. If the product dimension is snowflaked, there
is probably already a separate table for productcategory, which can serve as
the shrunken dimension.
Static Dimensions
Static dimensions are not extracted from the original data source,
but are created within the context of the data warehouse. A static dimension
can be loaded manually — for example with status codes — or it can be generated
by a procedure, such as a date or time dimension.
Types of Facts
There are three types of facts:
- Additive: facts that can be summed up through all of the dimensions in the fact table.
- Semi-Additive: facts that can be summed up for some of the dimensions in the fact table, but not the others.
- Non-Additive: facts that cannot be summed up for any of the dimensions present in the fact table.
Let us use examples to illustrate
each of the three types of facts. The first example assumes that we are a
retailer, and we have a fact table with the following columns:
| Date | Store | Product | Sales_Amount |
|------|-------|---------|--------------|
The purpose of this table is to record the sales amount for each
product in each store on a daily basis. Sales_Amount is the fact. In this case, Sales_Amount is an additive fact, because you can sum up
this fact along any of the three dimensions present in the fact table -- date,
store, and product. For example, the sum of Sales_Amount for all 7 days in a week represents the total
sales amount for that week.
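A hedged sketch of that weekly roll-up (hypothetical column names, assuming one row per store/product/day):

```sql
-- Summing the additive fact Sales_Amount along the date dimension:
-- total sales per store and product for one week.
SELECT store, product, SUM(sales_amount) AS weekly_sales
FROM   fact_sales
WHERE  sale_date BETWEEN DATE '2010-02-15' AND DATE '2010-02-21'
GROUP  BY store, product;
```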
Say we are a bank with the
following fact table:
| Date | Account | Current_Balance | Profit_Margin |
|------|---------|-----------------|---------------|
The purpose of this table is to record the current balance for
each account at the end of each day, as well as the profit margin for each
account for each day. Current_Balance and Profit_Margin are the facts. Current_Balance is a semi-additive fact, as it makes sense to
add them up for all accounts (what's the total current balance for all accounts
in the bank?), but it does not make sense to add them up through time (adding
up all current balances for a given account for each day of the month does not
give us any useful information). Profit_Margin is a non-additive fact, for it does not make
sense to add them up for the account level or the day level.
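To make the semi-additive case concrete, here is a sketch (hypothetical names) of the aggregation that is meaningful: summing balances across accounts for a single day, but never across days.

```sql
-- Valid for a semi-additive fact: total balance across ALL accounts
-- on ONE day. Summing current_balance across days would be meaningless.
SELECT balance_date, SUM(current_balance) AS total_bank_balance
FROM   fact_account_balance
WHERE  balance_date = DATE '2010-02-20'
GROUP  BY balance_date;
```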
The types of fact table are:
1. Snapshot
2. Cumulative
3. Factless Fact Table
Snapshot
This type of fact table describes the state of things at a particular instant of time, and usually includes more semi-additive and non-additive facts. The second example presented here is a snapshot fact table.
Eg: Daily balances fact can
be summed up through the customers dimension but not through the time
dimension.
Cumulative
This type of fact table
describes what has happened over a period of time. For example, this fact table
may describe the total sales by product by store by day. The facts for this
type of fact tables are mostly additive facts. The first example presented here
is a cumulative fact table.
Eg: Sales fact
Factless Fact Table
In the real world, it is possible to have a fact table that contains no measures or facts. These tables are called "factless fact tables".
Eg: A fact table which has only a product key and a date key is a factless fact table. There are no measures in this table, but you can still get the number of products sold over a period of time.
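A minimal sketch of that count (hypothetical table and key names; smart integer date keys are an assumption here):

```sql
-- Counting events from a factless fact table: number of distinct
-- products sold in a period, with no measure column at all.
SELECT COUNT(DISTINCT product_key) AS products_sold
FROM   fact_product_sold
WHERE  date_key BETWEEN 20100101 AND 20100331;
```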
Slowly Changing Dimension Types, Advantages & Disadvantages
There are many approaches to dealing with SCDs. The most popular are:
- Type 0 - The passive method
- Type 1 - Overwriting the old value
- Type 2 - Creating a new additional record
- Type 3 - Adding a new column
- Type 4 - Using a historical table
- Type 6 - Combining approaches of types 1, 2, and 3 (1+2+3=6)
Type 0 - The passive method. In this method no special action is performed upon dimensional changes. Some dimension data can remain the same as when it was first inserted; other data may be overwritten.
Example: Employee birth date.
Type 1: The new record replaces the original record.
No trace of the old record exists.
Type 2: A new record is added into the customer
dimension table. Thereby, the customer is treated essentially as two people.
Type 3: The original record is modified to reflect
the change.
SCD Type 1, SCD Type 2, SCD Type 3, Slowly Changing Dimension
Types, Advantages & Disadvantages
In Type 1 Slowly Changing
Dimension, the new information simply overwrites the original information. In
other words, no history is kept.
In our example, recall we originally
have the following table:
| Customer Key | Name | State |
|--------------|------|-------|
| 1001 | Williams | New York |
After Williams moved from New York to Los Angeles, the new information replaces the original record, and we have the following table:
| Customer Key | Name | State |
|--------------|------|-------|
| 1001 | Williams | Los Angeles |
Advantages
- This is the easiest way to handle the Slowly Changing Dimension problem, since there is no need to keep track of the old information.
Disadvantages
- All history is lost. By applying this methodology, it is not possible to trace back in history. For example, in this case, the company would not be able to know that Williams lived in New York before.
When to use Type 1
Type 1 slowly changing
dimension should be used when it is not necessary for the data warehouse to
keep track of historical changes.
Example:
Type 1 - Overwriting the old value. In this method no history of dimension changes is kept in the database. The old dimension value is simply overwritten by the new one. This type is easy to maintain and is often used for data whose changes are caused by processing corrections (e.g. removal of special characters, correcting spelling errors).
Before the change:

| Customer_ID | Customer_Name | Customer_Type |
|-------------|---------------|---------------|
| 1 | Cust_1 | Corporate |
After the change:

| Customer_ID | Customer_Name | Customer_Type |
|-------------|---------------|---------------|
| 1 | Cust_1 | Retail |
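A minimal sketch of the Type 1 overwrite (hypothetical dimension table and keys):

```sql
-- SCD Type 1: overwrite the attribute in place; no history is kept.
UPDATE dim_customer
SET    customer_type = 'Retail'
WHERE  customer_id   = 1;
```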
SCD Type 2, Slowly Changing
Dimension Use, Example, Advantage, Disadvantage
In Type 2 Slowly Changing
Dimension, a new record is added to the table to represent the new information.
Therefore, both the original and the new record will be present. The new record
gets its own primary key.
In our example, recall we
originally have the following table:
| Customer Key | Name | State |
|--------------|------|-------|
| 1001 | Williams | New York |
After Williams moved from
New York to Los Angeles, we add the new information as a new row into the
table:
| Customer Key | Name | State |
|--------------|------|-------|
| 1001 | Williams | New York |
| 1005 | Williams | Los Angeles |
Advantages
- This allows us to accurately keep all historical information.
Disadvantages
- This will cause the size of the table to grow fast. In cases where the number of rows for the table is very high to start with, storage and performance can become a concern.
- This necessarily complicates the ETL process.
When to use Type 2
Type 2 slowly changing
dimension should be used when it is necessary for the data warehouse to track
historical changes.
Example:
Type 2 - Creating a new additional record. In this methodology all history of dimension changes is kept in the database. You capture an attribute change by adding a new row with a new surrogate key to the dimension table. Both the prior and new rows contain the natural key (or another durable identifier) as an attribute. 'Effective date' and 'current indicator' columns are also used in this method. There can be only one record with the current indicator set to 'Y'. For the 'effective date' columns, i.e. start_date and end_date, the end_date for the current record is usually set to the value 9999-12-31. Introducing changes to the dimensional model in type 2 can be a very expensive database operation, so it is not recommended in dimensions where a new attribute could be added in the future.
Before the change:

| Customer_ID | Customer_Name | Customer_Type | Start_Date | End_Date | Current_Flag |
|-------------|---------------|---------------|------------|------------|--------------|
| 1 | Cust_1 | Corporate | 22-07-2010 | 31-12-9999 | Y |
After the change:

| Customer_ID | Customer_Name | Customer_Type | Start_Date | End_Date | Current_Flag |
|-------------|---------------|---------------|------------|------------|--------------|
| 1 | Cust_1 | Corporate | 22-07-2010 | 17-05-2012 | N |
| 2 | Cust_1 | Retail | 18-05-2012 | 31-12-9999 | Y |
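A minimal sketch of the Type 2 change above (hypothetical table; Customer_ID acts as the surrogate key here, as in the example):

```sql
-- SCD Type 2, step 1: expire the current row.
UPDATE dim_customer
SET    end_date      = DATE '2012-05-17',
       current_flag  = 'N'
WHERE  customer_name = 'Cust_1'
AND    current_flag  = 'Y';

-- Step 2: insert a new row with a new surrogate key.
INSERT INTO dim_customer
       (customer_id, customer_name, customer_type,
        start_date, end_date, current_flag)
VALUES (2, 'Cust_1', 'Retail',
        DATE '2012-05-18', DATE '9999-12-31', 'Y');
```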
SCD Type 3, Slowly
Changing Dimension Use, Example, Advantage, Disadvantage
In Type 3 Slowly Changing
Dimension, there will be two columns to indicate the particular attribute of
interest, one indicating the original value, and one indicating the current
value. There will also be a column that indicates when the current value
becomes active.
In our example, recall we
originally have the following table:
| Customer Key | Name | State |
|--------------|------|-------|
| 1001 | Williams | New York |
To accommodate Type 3 Slowly Changing Dimension, we will now have the following columns:
- Customer Key
- Name
- Original State
- Current State
- Effective Date
After Williams moved from New
York to Los Angeles, the original information gets updated, and we have the
following table (assuming the effective date of change is February 20, 2010):
| Customer Key | Name | Original State | Current State | Effective Date |
|--------------|------|----------------|---------------|----------------|
| 1001 | Williams | New York | Los Angeles | 20-FEB-2010 |
Advantages
- This does not increase the size of the table, since the new information is updated in place.
- This allows us to keep some part of history.
Disadvantages
- Type 3 will not be able to keep all history where an attribute is changed more than once. For example, if Williams later moves to Texas on December 15, 2013, the Los Angeles information will be lost.
When to use Type 3
Type 3 slowly changing dimension should only be used when it is necessary for the data warehouse to track historical changes, and when such changes will only occur a finite number of times.
Example:
Type 3 - Adding a new column. In this type usually only the current and previous values of the dimension attribute are kept in the database. The new value is loaded into the 'current/new' column and the old one into the 'old/previous' column. Generally speaking, the history is limited to the number of columns created for storing historical data. This is the least commonly needed technique.
Before the change:

| Customer_ID | Customer_Name | Current_Type | Previous_Type |
|-------------|---------------|--------------|---------------|
| 1 | Cust_1 | Corporate | Corporate |
After the change:

| Customer_ID | Customer_Name | Current_Type | Previous_Type |
|-------------|---------------|--------------|---------------|
| 1 | Cust_1 | Retail | Corporate |
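A minimal sketch of the Type 3 change (hypothetical names; in standard set-based SQL the right-hand side reads the pre-update value, so the shift and overwrite can happen in one statement):

```sql
-- SCD Type 3: shift the current value into the 'previous' column,
-- then overwrite the current one.
UPDATE dim_customer
SET    previous_type = current_type,   -- reads the OLD current_type
       current_type  = 'Retail'
WHERE  customer_id   = 1;
```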
Type 4 - Using a historical table. In this method a separate historical table is used to track all historical changes to each dimension's attributes. The 'main' dimension table keeps only the current data, e.g. customer and customer_history tables.
Current table:

| Customer_ID | Customer_Name | Customer_Type |
|-------------|---------------|---------------|
| 1 | Cust_1 | Corporate |
Historical table:

| Customer_ID | Customer_Name | Customer_Type | Start_Date | End_Date |
|-------------|---------------|---------------|------------|------------|
| 1 | Cust_1 | Retail | 01-01-2010 | 21-07-2010 |
| 1 | Cust_1 | Other | 22-07-2010 | 17-05-2012 |
| 1 | Cust_1 | Corporate | 18-05-2012 | 31-12-9999 |
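A minimal Type 4 sketch (hypothetical customer and customer_history tables, following the example above where the history table also holds the open-ended current version):

```sql
-- Step 1: close out the previous version in the history table.
UPDATE customer_history
SET    end_date    = DATE '2012-05-17'
WHERE  customer_id = 1
AND    end_date    = DATE '9999-12-31';

-- Step 2: record the new version in the history table.
INSERT INTO customer_history
       (customer_id, customer_name, customer_type, start_date, end_date)
VALUES (1, 'Cust_1', 'Corporate', DATE '2012-05-18', DATE '9999-12-31');

-- Step 3: overwrite the 'main' table, which keeps only current data.
UPDATE customer
SET    customer_type = 'Corporate'
WHERE  customer_id   = 1;
```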
Type 6 - Combining approaches of types 1, 2, and 3 (1+2+3=6). In this type the dimension table has additional columns such as:
- current_type - for keeping the current value of the attribute. All history records for a given item of the attribute have the same current value.
- historical_type - for keeping the historical value of the attribute. All history records for a given item of the attribute can have different values.
- start_date - for keeping the start date of the 'effective date' of the attribute's history.
- end_date - for keeping the end date of the 'effective date' of the attribute's history.
- current_flag - for keeping information about the most recent record.
In this method, to capture an attribute change we add a new record as in type 2. The current_type information is overwritten with the new value as in type 1. We store the history in a historical column as in type 3.
| Customer_ID | Customer_Name | Current_Type | Historical_Type | Start_Date | End_Date | Current_Flag |
|-------------|---------------|--------------|-----------------|------------|------------|--------------|
| 1 | Cust_1 | Corporate | Retail | 01-01-2010 | 21-07-2010 | N |
| 2 | Cust_1 | Corporate | Other | 22-07-2010 | 17-05-2012 | N |
| 3 | Cust_1 | Corporate | Corporate | 18-05-2012 | 31-12-9999 | Y |
WHAT IS ROW-LEVEL SECURITY?
Row-level security is about constraining access to data for a database user at a very granular level.
By default, the SQL Server relational engine can only limit user access at a certain level of granularity. For example, SQL Server can limit access to a table and columns, but does not provide security within tables (i.e. based on the data contained within the table). Said another way, the grain of SQL Server security is at the object level. What we want is access control at a more granular level (the row level).
Since SQL Server does not have native support for row-level security, we have to build a model to support the needs of the business. There are many different approaches to implementing row-level security, but the one we will cover in this article is the use of custom views.
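A minimal sketch of the custom-view approach (hypothetical tables; the mapping table links SQL Server logins to the rows they may see):

```sql
-- Mapping table: which login may see which region's rows.
CREATE TABLE user_region_map (
    login_name VARCHAR(128) NOT NULL,
    region     VARCHAR(50)  NOT NULL
);

-- The view filters the base table by the caller's login.
CREATE VIEW v_sales_secure AS
SELECT s.*
FROM   sales s
JOIN   user_region_map m
       ON m.region = s.region
WHERE  m.login_name = SUSER_SNAME();  -- current SQL Server login

-- Users are granted SELECT on the view, not on the base table:
-- GRANT SELECT ON v_sales_secure TO sales_reader;
```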
Gathering Statistics
Understanding Statistics
Statistics quantify the data distribution and storage
characteristics of tables, columns, indexes, and partitions.
You
must gather statistics on a regular basis to provide the optimizer with
information about schema objects. New statistics should be gathered after a
schema object's data or structure are modified in ways that make the previous
statistics inaccurate. For example, after loading a significant number of rows
into a table, you should collect new statistics on the number of rows. After
updating data in a table, you do not need to collect new statistics on the
number of rows but you might need new statistics on the average row length.
Generating Statistics
Because the cost-based approach relies on statistics, you should
generate statistics for all tables and clusters and all types of indexes
accessed by your SQL statements before using the cost-based approach. If the
size and data distribution of your tables change frequently, then you should
generate these statistics regularly to ensure the statistics accurately
represent the data in the tables.
- Estimation based on random data sampling
- Exact computation
- User-defined statistics collection methods
Gathering Statistics with the
DBMS_STATS Package
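The procedure list that belongs here did not survive extraction, but as a hedged sketch, a typical call to gather statistics on one table (schema and table names are placeholders) looks like this:

```sql
-- Gather optimizer statistics on a table with Oracle's DBMS_STATS package.
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname          => 'SALES_DW',
    tabname          => 'FACT_SALES',
    estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE,
    cascade          => TRUE  -- also gather statistics on the table's indexes
  );
END;
/
```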
Bulk load and normal load
Bulk loading improves the performance of a session that inserts a large amount of data into the target database. When bulk loading, the Integration Service bypasses the database log, which speeds performance. Without writing to the database log, however, the target database cannot perform rollback. As a result, you may not be able to perform recovery. Bulk load is used when you are loading a large amount of data (more than 100 GB); it makes the data loading faster.
In normal load the database log is not bypassed, and therefore the target database can recover from an incomplete session.
[Informatica] Bulk Load vs Normal Load
While loading data to relational targets, Informatica provides an option to do either a Bulk load or a Normal load. Selecting the appropriate option can not only improve your session performance but also make your rollback strategy foolproof.
Bulk load should be used when the data is humongous in size (greater than 100 GB) and the operation is performance intensive. Bulk load can greatly speed up your session performance in such a case.
However, the trade-off with Bulk load is that it bypasses database logs, and thus recovery in case of failure is not possible. On the other hand, Normal load logs each and every database transaction while loading, thus enabling rollback.
Also, while using Bulk load, one cannot have indexes defined on the target; Bulk load fails in such a case. One must drop the indexes, do the Bulk load, and then recreate the indexes. The other option is to use Normal load.
HINTS
In various SQL implementations, a hint is an addition to the SQL standard that instructs the database
engine on how to execute the query. For example, a hint may tell the engine to use or not to
use an index (even if the query optimizer would decide otherwise).
Types of hints:
Join Order Hints
In case you believe a different
join order to be useful, you can use one of the join order hints: ORDEREDor LEADING. The
latter is more versatile and should thus be preferred.
ORDERED takes no parameters and instructs the
optimizer to join the tables in the order as they appear in the FROM clause. Because the ORDERED hint is so basic and you do not want to
move around tables in the FROM clause,
Oracle has provided us with the LEADING hint.
It takes the table names or aliases (if specified) as parameters, separated by
spaces.
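A brief sketch of the LEADING hint (hypothetical employees/departments tables; note that the hint receives the aliases, since aliases are specified):

```sql
-- Ask the optimizer to start the join order with e, then d.
SELECT /*+ LEADING(e d) */
       e.employee_id, d.department_name
FROM   employees   e
JOIN   departments d ON d.department_id = e.department_id;
```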
Join Operation Hints
Join operation hints are also paired:
- USE_HASH / NO_USE_HASH
- USE_MERGE / NO_USE_MERGE
- USE_NL / NO_USE_NL
These hints allow you to instruct the optimizer to use a hash join, a sort-merge join, or nested loops, respectively.
Hash joins support input swapping, which we have discussed when we talked about left-deep and right-deep join trees. This can be accomplished with SWAP_JOIN_INPUTS or prohibited with NO_SWAP_JOIN_INPUTS.
Parallel Execution Hints
Not all SQL statements can be run in parallel. All DML statements, including subqueries, can be run in parallel, which means that multiple blocks can be selected, inserted, deleted, or updated simultaneously. For parallelized DDL statements, multiple blocks are created/altered and written in parallel. The DDL statements that can be run in parallel are:
- CREATE INDEX
- CREATE TABLE ... AS SELECT
- ALTER INDEX ... REBUILD
- ALTER INDEX ... [ REBUILD | SPLIT ] PARTITION
- ALTER TABLE ... MOVE
- ALTER TABLE ... [ MOVE | SPLIT | COALESCE ] PARTITION
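As a brief sketch of parallelized DDL (Oracle syntax; the table names and the degree of 4 are arbitrary placeholders):

```sql
-- Build an index in parallel with degree 4.
CREATE INDEX idx_fact_sales_date
    ON fact_sales (date_key)
    PARALLEL 4;

-- Parallel CTAS: both the SELECT and the table creation run in parallel.
CREATE TABLE fact_sales_copy
    PARALLEL 4
    AS SELECT * FROM fact_sales;
```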
Access Path Hints
Access path hints
determine how Oracle accesses the data you require. They can be divided into
two groups: access path hints for tables and access path hints for indexes.
Tables
The most prominent hint
in this group is the FULL( tab_name ) hint.
It instructs the optimizer to access a table by means of a full table scan. If
the table you want Oracle to access with a full table scan has an alias in the
SQL statement, you have to use the alias rather than the table name (without
the schema name) as the parameter to FULL. For named
query blocks you have to provide the query block’s name as discussed
previously.
In this group are also
the CLUSTER and HASH hints,
but they apply only to tables in an indexed cluster and hash clusters
respectively.
Indexes
The hints in this group all come in pairs:
- INDEX / NO_INDEX
- INDEX_ASC / INDEX_DESC
- INDEX_FFS / NO_INDEX_FFS
- INDEX_SS / NO_INDEX_SS
- INDEX_SS_ASC / INDEX_SS_DESC
- INDEX_COMBINE / INDEX_JOIN
All these hints take at least
one parameter: the table name or alias in the SQL statement. A second
parameter, the index name(s), is optional but often provided. If more than one
index is provided, the indexes are separated by at least one space; the INDEX_COMBINE hint is recommended for this use case
though.
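A short sketch of the table and index access path hints together (hypothetical customers table and index name; note the alias, not the table name, is passed to the hint):

```sql
-- Force a full table scan via the alias.
SELECT /*+ FULL(c) */ COUNT(*)
FROM   customers c;

-- Request a specific index on the same table.
SELECT /*+ INDEX(c customers_state_ix) */ *
FROM   customers c
WHERE  c.state = 'NY';
```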
SDE and SIL Mappings
SDE_* - The workflow loads data into the staging table. Staging tables do not have any indexes defined on them, and the load into the staging table is always a truncate and load. Since there are no indexes defined, "Bulk" load mode is used.
SIL (Source Independent Loading)
SIL_* - You can find two SIL_* workflows in the Workflow Manager. During a full load, the workflow (SIL_*_Full) is invoked from DAC: the target table is truncated, indexes are dropped, data is loaded in 'Bulk' mode, and then the indexes are rebuilt. During an incremental load, the workflow (SIL_*) is invoked, which just inserts/updates the incremental data. Since the amount of data during an incremental load is minimal compared to a full load, it runs in 'Normal' mode. If indexes are present on a table, "Bulk" mode does not work.