_Transportation_Refreshing Warehouse Data (Data Warehouse, English, oracle99 edition, teaching courseware).ppt

Uploader: 微传9988   Document ID: 2266344   Upload date: 2018-09-08   Format: PPT   Pages: 38   Size: 1.11 MB
Resource description

Transportation: Refreshing Warehouse Data

Overview

Objectives
After completing this lesson, you should be able to do the following:
- Describe methods for capturing changed data
- Explain techniques for applying the changes
- Discuss techniques for purging and archiving data
- Outline final tasks, such as publishing the data, controlling access, and automating processes
- List tools for transporting data into the warehouse

Developing a Refresh Strategy for Capturing Changed Data
- Consider load window
- Identify data volumes
- Identify cycle
- Know the technical infrastructure
- Plan a staging area
- Determine how to detect changes
[Diagram: operational databases at times T1, T2, T3]

User Requirements and Assistance
- Users define the refresh cycle
- IT balances requirements against technical issues
- Document all tasks and processes
- Employ user skills
[Diagram: operational databases at times T1, T2, T3]

Load Window
- Time available for entire ETT process
- Plan
- Test
- Prove
- Monitor
[Timeline: 24-hour axis (0, 3 am, 6, 9, 12 pm, 3, 6, 9, 12) marking the User Access Period and the Load Windows]

Load Window
- Plan and build processes according to a strategy.
- Consider volumes of data.
- Identify technical infrastructure.
- Ensure currency of data.
- Consider user access requirements first.
- High availability requirements may mean a small load window.
[Timeline: 24-hour axis (0, 3 am, 6, 9, 12 pm, 3, 6, 9, 12) marking the User Access Period]

Scheduling the Load Window (0 to 3 am, steps 1 to 4)
[Diagram: the step numbers cannot be matched to their labels in this extract. Labels shown: Requirements; Load cycle; Receive data (File 1, File 2) by FTP; Control process; Open and read files to verify and analyze]
Control File contents:
- File names
- File types
- Number of files
- Number of loads
- First-time load or refresh
- Date of file
- Date range
- Records in file: counts
- Totals: amounts

Scheduling the Load Window (3 am to 9 am, steps 5 to 9)
[Diagram: the step numbers cannot be matched to their labels in this extract. Labels shown: Load into warehouse (File 1, File 2; parallel load); Verify, analyze, reapply; Create summaries; Index data; Update metadata]

Scheduling the Load Window (6 am to 9 am, steps 10 to 13)
[Diagram: the step numbers cannot be matched to their labels in this extract. Labels shown: Create views for specialized tools; Back up warehouse; Users access summary data; Publish; User access]

Capturing Changed Data for Refresh
- Capture new fact data
- Capture changed dimension data
- Determine method for capture of each
Methods:
- Wholesale data replacement
- Comparison of database instances
- Time stamping
- Database triggers
- Database log
- Hybrid techniques

Wholesale Data Replacement
- Expensive
- Limited historical data, if any
- Data mart implementations
- Time period replacement

Comparison of Database Instances
- Simple to perform, but expensive in time and processing
- Delta file:
  - Changes to operational data since last refresh
  - Used by various techniques
[Diagram: yesterday's operational database and today's operational database pass through a database comparison; the delta file holds the changed data]

Time and Date Stamping
- Fast scanning for records changed since last extraction
- Date Updated field
- No detection of deleted data
[Diagram: operational data is scanned; the delta file holds the changed data]

Database Triggers
- Changed data intercepted at the server level
- Extra I/O required
- Maintenance overhead
[Diagram: triggers on the operational server (DBMS) watch the operational data; the delta file holds the changed data]

Using a Database Log
- Contains before and after images
- Requires system checkpoint
- Common technique
[Diagram: the log on the operational server (DBMS) feeds log analysis and data extraction]

Verdict
- Consider each method on merit.
- Consider a hybrid approach if one approach is not suitable.
- Consider current technical, existing operational, and current application issues.

Applying the Changes to Data
You have a choice of techniques:
- Overwrite a record
- Add a record
- Add a field
- Maintain history
- Add version numbers

Overwriting a Record
Before: Customer Id | John Doe | Single
After:  Customer Id | John Doe | Married
- Easy to implement
- Loses all history
- Not recommended

Adding a New Record
Example row: 1 | Customer Id | John Doe | Single
- History is preserved; dimensions grow.
- Time constraints are not required.
- Generalized key is created.
- Metadata tracks usage of keys.

Adding a Current Field
Before: Customer Id | John Doe | Single
After:  Customer Id | John Doe | Single | Married | 01-JAN-96
- Maintains some history
- Loses intermediate values
- Is enhanced by adding an Effective Date field

Limitations of Methods for Applying Changes
- Complete history impossible
- Dimensions may grow large
- Maintenance overhead

Maintaining History
- One-to-many relationship
- Always retain current record
- Consistently able to refer to record history
[Diagram: Sales fact table with Product, Time, and CUSTOMER dimensions; CUSTOMER is linked to HIST_CUST]

History Preserved
- History enables realistic analysis.
- History retains context of data.
- History provides for realistic historical analysis.
Model must be able to:
- Reflect business changes
- Maintain context between fact and dimension data
- Retain sufficient data to relate old to new

Version Numbering
- Avoid double counting
- Facts hold version number

Customer
CustId   Version   Customer Name
1234     1         Comer
1234     2         Comer

Sales
CustId   Version   Sales Facts
1234     1         11,000
1234     2         12,000

[Diagram: Sales fact table with Customer, Product, and Time dimensions]

Purging and Archiving Data
- As data ages, its value depreciates.
- Remove old data from the warehouse:
  - Archive for later use
  - Purge without copy

Techniques for Purging Data
- TRUNCATE: Retains no rollback
- DELETE: Retains redo and rollback
- ALTER TABLE: Removes a partition
- PL/SQL: Uses database triggers

Techniques for Archiving Data
- Export to dump file from tables
- Import to tables from dump file
- ALTER TABLE EXCHANGE partitions
[Diagram: EXP writes a .dmp file; IMP reads it]

Verdict
- Defined by business requirements
- Must be managed

Final Tasks
- Update metadata
  - ETT
  - User
- Publish data
  - Availability
  - Changes
  - Subject area basis
- Use database roles to prevent and allow access
[Diagram: Sources, Extract, Stage, Transform (Rules), Load, Publish, Query]

Publishing Data
- Control access using database roles
- 24-hour operation may be requested
- Compromise between load and access
- Consider:
  - Staggering updates
  - Using temporary tables
  - Using separate tables

ETT Tool Selection Criteria
- Overlap with existing tools
- Availability of meta model
- Supported data sources
- Ease of modification and maintenance
- Required fine tuning of code
- Ease of change control
- Power of transformation logic
- Level of modularization
- Power of error, exception, resubmission features
- Intuitive documentation
- Performance of code

ETT Tool Selection Criteria
- Activity scheduling and sophistication
- Metadata generation
- Learning curve
- Flexibility
- Supported operating systems
- Cost

Transportation Tools
- Informatica OpenBridge
- Oracle: SQL*Loader, Gateways, PL/SQL, Precompilers
- Platinum Technology InfoPump
- Platinum Info Transport

Replication Server Utilities
- Oracle Symmetric and Heterogeneous Replication

Gateways and Middleware
- Brio Technology DataPrism
- Informatica Corporation OpenBridge
- Information Builders EDA/SQL
- Oracle Gateways
- Platinum Technology InfoHub
- Prism Prism Manager
- Software AG Entire Transaction Propagator

Summary
This lesson discussed the following topics:
- Capturing changed data
- Applying the changes
- Purging and archiving data
- Publishing the data, controlling access, and automating processes
- Identifying tools for transporting data into the warehouse

Practice 13-1 Overview
This practice covers the following topics:
- Identifying a series of statements as true or false
- Answering a series of questions
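
The "Comparison of Database Instances" slide describes building a delta file by comparing yesterday's copy of the operational data with today's. The following is a minimal Python sketch of that idea, not the course's own implementation. It assumes both snapshots fit in memory as dictionaries keyed by primary key; the name `compute_delta` and the sample rows are hypothetical.

```python
from typing import Any, Dict, List, Tuple

Row = Dict[str, Any]
Snapshot = Dict[int, Row]  # primary key -> row (assumed in-memory shape)


def compute_delta(yesterday: Snapshot, today: Snapshot) -> List[Tuple[str, int, Row]]:
    """Compare two snapshots and return the changes as (operation, key, row)."""
    delta: List[Tuple[str, int, Row]] = []
    # New and changed rows are found by walking today's snapshot.
    for key, row in today.items():
        if key not in yesterday:
            delta.append(("INSERT", key, row))
        elif row != yesterday[key]:
            delta.append(("UPDATE", key, row))
    # Rows present yesterday but missing today were deleted at the source.
    for key, row in yesterday.items():
        if key not in today:
            delta.append(("DELETE", key, row))
    return delta


# Hypothetical sample data, for illustration only.
yesterday = {
    1: {"name": "John Doe", "status": "Single"},
    2: {"name": "Ann Lee", "status": "Married"},
}
today = {
    1: {"name": "John Doe", "status": "Married"},
    3: {"name": "Raj Rao", "status": "Single"},
}

delta = compute_delta(yesterday, today)
for operation, key, row in delta:
    print(operation, key, row)
```

The second loop is what distinguishes full comparison from time and date stamping: a scan of a Date Updated field only sees rows that still exist, so it cannot report the deletion of key 2. The price is that both snapshots must be read in full, which matches the slide's remark that the method is expensive in time and processing.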

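
The "Adding a New Record" and "Version Numbering" slides describe keeping history by appending a new dimension row under a generalized key instead of overwriting the old one. The sketch below illustrates that technique under stated assumptions: the dimension is an append-only in-memory list, the generalized key is simply the next row number, and the names `CustomerRow` and `add_new_record` are hypothetical rather than taken from the course.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CustomerRow:
    surrogate_key: int   # the "generalized key"; unique per row
    customer_id: int     # operational key; repeated across versions
    name: str
    marital_status: str
    version: int         # 1 for the first record of a customer, then 2, 3, ...


def add_new_record(dimension: List[CustomerRow], customer_id: int,
                   name: str, marital_status: str) -> CustomerRow:
    """Append a new version of a customer instead of overwriting the old row."""
    previous = [row for row in dimension if row.customer_id == customer_id]
    new_row = CustomerRow(
        surrogate_key=len(dimension) + 1,  # valid only because rows are never removed
        customer_id=customer_id,
        name=name,
        marital_status=marital_status,
        version=len(previous) + 1,
    )
    dimension.append(new_row)
    return new_row


# Hypothetical sample data, for illustration only.
dimension: List[CustomerRow] = []
add_new_record(dimension, 1234, "John Doe", "Single")
add_new_record(dimension, 1234, "John Doe", "Married")

for row in dimension:
    print(row)
```

Both rows survive, so a fact recorded while the customer was single can still be joined to the "Single" row through its key and version number. The dimension grows by one row per change, which is the trade-off the slides note.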