Chapter 8 : Accessing Organizational Information – Data Warehouse


HISTORY OF DATA WAREHOUSING
  • Data warehouses extend the transformation of data into information
  • In the 1990s, executives became less concerned with the day-to-day business operations and more concerned with overall business functions
  • The data warehouse provided the ability to support decision making without disrupting the day-to-day operations

DATA WAREHOUSE FUNDAMENTALS
  • Data warehouse – a logical collection of information – gathered from many different operational databases – that supports business analysis activities and decision-making tasks
  • The primary purpose of a data warehouse is to aggregate information throughout an organization into a single repository for decision-making purposes
  • The primary difference between a database and a data warehouse is that a database stores information for a single application, whereas a data warehouse stores information from multiple databases, or multiple applications, and external information such as industry information 
  • This enables cross-functional analysis, industry analysis, market analysis, etc., all from a single repository
  • Data warehouses support only analytical processing (OLAP)
  • Extraction, transformation, and loading (ETL) – a process that extracts information from internal and external databases, transforms the information using a common set of enterprise definitions, and loads the information into a data warehouse
  • The ETL process gathers data from the internal and external databases and passes it to the data warehouse
  • The ETL process also gathers data from the data warehouse and passes it to the data marts
  • Data mart – contains a subset of data warehouse information


  • The data warehouse modeled in the above figure compiles information from internal databases or transactional/operational databases and external databases through ETL
  • It then sends subsets of information to the data marts through the ETL process
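The ETL flow described above can be sketched in a few lines of Python. This is only a minimal illustration, not a real ETL tool: the source records, the field names (`cust`, `amount_usd`, `customer_id`), and the in-memory lists standing in for the warehouse and the data mart are all hypothetical.

```python
# Hypothetical source records; field names and values are made up for illustration.
internal_sales = [
    {"cust": "C-001", "amount_usd": "120.50", "region": "north"},
    {"cust": "C-002", "amount_usd": "75.00", "region": "South"},
]
external_industry = [
    {"customer_id": "C-003", "amount": 200.0, "region": "EAST"},
]


def extract():
    """Extract: gather raw records from internal and external sources."""
    tagged = [("internal", rec) for rec in internal_sales]
    tagged += [("external", rec) for rec in external_industry]
    return tagged


def transform(tagged_records):
    """Transform: map every record onto one common set of definitions."""
    rows = []
    for source, rec in tagged_records:
        customer = rec["customer_id"] if "customer_id" in rec else rec["cust"]
        amount = rec["amount"] if "amount" in rec else rec["amount_usd"]
        rows.append(
            {
                "customer_id": customer,
                "amount": float(amount),
                "region": rec["region"].strip().title(),
                "source": source,
            }
        )
    return rows


def load(rows, target):
    """Load: append the conformed rows to the target store."""
    target.extend(rows)
    return target


# First pass: source databases -> data warehouse.
warehouse = load(transform(extract()), [])

# Second pass: a subset of the warehouse -> a data mart.
north_mart = load([row for row in warehouse if row["region"] == "North"], [])
```

After the first pass every row shares the same field names and formats regardless of where it came from, which is the point of transforming against a common set of enterprise definitions.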

MULTIDIMENSIONAL ANALYSIS AND DATA MINING
  • Databases contain information in a series of two-dimensional tables
  • In a data warehouse and data mart, information is multidimensional: it contains layers of columns and rows
  • Dimension – a particular attribute of information
  • Each layer in a data warehouse or data mart represents information according to an additional dimension
  • Dimensions could include such things as:
    • Products
    • Promotions
    • Stores
    • Category
    • Region
    • Stock price
    • Date
    • Time
    • Weather
  • Why is the ability to look at information based on different dimensions critical to a business's success?
    • Ans:  The ability to look at information from different dimensions can add tremendous business insight
    • By slicing-and-dicing the information a business can uncover great unexpected insights
  • Cube – common term for the representation of multidimensional information
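As a rough sketch, a cube over three of the dimensions listed above (store, product, promotion) can be modeled as a mapping from a coordinate to a measure. The fact rows, the `units_sold` measure, and the `slice_cube` helper are hypothetical and exist only for illustration; real OLAP tools store and index cubes very differently.

```python
from collections import defaultdict

# Hypothetical fact rows: (store, product, promotion, units_sold).
facts = [
    ("Store 1", "Product A", "Promo I", 10),
    ("Store 1", "Product B", "Promo II", 4),
    ("Store 2", "Product B", "Promo III", 7),
    ("Store 2", "Product A", "Promo II", 3),
    ("Store 2", "Product B", "Promo II", 5),
]

# The cube maps a coordinate along three dimensions to a measure.
cube = defaultdict(int)
for store, product, promo, units in facts:
    cube[(store, product, promo)] += units


def slice_cube(cube, store=None, product=None, promo=None):
    """Keep only the cells that match the fixed dimension values (None = all)."""
    wanted = (store, product, promo)
    return {
        key: units
        for key, units in cube.items()
        if all(w is None or w == k for w, k in zip(wanted, key))
    }


# One promotion across all products and all stores.
promo_ii = slice_cube(cube, promo="Promo II")

# One promotion for one product at one store.
single_cell = slice_cube(
    cube, store="Store 2", product="Product B", promo="Promo III"
)
```

Fixing one dimension yields a two-dimensional slice; fixing all three narrows the cube down to a single cell.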



  • Users can slice and dice the cube to drill down into the information
  • Cube A represents store information (the layers), product information (the rows), and promotion information (the columns)
  • Cube B represents a slice of information displaying promotion II for all products at all stores
  • Cube C represents a slice of information displaying promotion III for product B at store 2
  • Data mining – the process of analyzing data to extract information not offered by the raw data alone
  • Data mining can begin at a summary information level (coarse granularity) and progress through increasing levels of detail (drilling down), or the reverse (drilling up)
  • To perform data mining users need data-mining tools
  • Data-mining tool – uses a variety of techniques to find patterns and relationships in large volumes of information and infers rules that predict future behavior and guide decision making
  • Data-mining tools include query tools, reporting tools, multidimensional analysis tools, statistical tools, and intelligent agents
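Moving between coarse and fine granularity, as described above, can be illustrated by summarizing the same data at different levels of a time dimension. The daily sales rows and the `roll_up` helper below are hypothetical, and the sketch covers only the drill-down and drill-up movement, not the pattern-finding techniques of a full data-mining tool.

```python
from collections import defaultdict

# Hypothetical daily sales: (ISO date, region, revenue).
daily_sales = [
    ("2024-01-05", "North", 100),
    ("2024-01-20", "South", 50),
    ("2024-02-11", "North", 70),
    ("2024-02-12", "North", 30),
]


def roll_up(rows, level):
    """Summarize revenue at a time granularity: 'year', 'month', or 'day'."""
    # An ISO date "YYYY-MM-DD" can be truncated to get the coarser period.
    prefix_len = {"year": 4, "month": 7, "day": 10}[level]
    totals = defaultdict(int)
    for date, _region, revenue in rows:
        totals[date[:prefix_len]] += revenue
    return dict(totals)


by_year = roll_up(daily_sales, "year")    # summary level (coarse granularity)
by_month = roll_up(daily_sales, "month")  # drilling down one level
by_day = roll_up(daily_sales, "day")      # finest level of detail
```

Reading the results from `by_year` to `by_day` is drilling down; reading them in the opposite order is drilling up.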

INFORMATION CLEANSING OR SCRUBBING
  • An organization must maintain high-quality data in the data warehouse
  • Information cleansing or scrubbing – a process that weeds out and fixes or discards inconsistent, incorrect, or incomplete information
  • Contact information in an operational system


  • Taking a look at customer information highlights why information cleansing and scrubbing is necessary
  • Customer information exists in several operational systems
  • In each system, every detail of this customer information could differ, from the customer ID to the contact information
  • Determining which contact information is accurate and correct for this customer depends on the business process that is being executed
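A minimal cleansing sketch, assuming the same customer appears in three operational systems with differently formatted names and phone numbers. The records, system names, and formatting rules are hypothetical; a real cleansing process would apply the organization's own standards.

```python
import re

# Hypothetical records for one customer held in three operational systems.
records = [
    {"system": "billing", "name": "  SMITH,  john ", "phone": "555-0101"},
    {"system": "sales", "name": "John Smith", "phone": ""},
    {"system": "support", "name": "john  smith", "phone": "(555) 0101"},
]


def standardize_name(raw):
    """Collapse whitespace, turn 'Last, First' into 'First Last', title-case."""
    name = re.sub(r"\s+", " ", raw).strip()
    if "," in name:
        last, first = [part.strip() for part in name.split(",", 1)]
        name = f"{first} {last}"
    return name.title()


def standardize_phone(raw):
    """Keep digits only; return None when nothing usable remains."""
    digits = re.sub(r"\D", "", raw)
    return digits or None


def cleanse(rows):
    """Fix inconsistent fields and discard records that stay incomplete."""
    cleaned = []
    for row in rows:
        fixed = {
            "system": row["system"],
            "name": standardize_name(row["name"]),
            "phone": standardize_phone(row["phone"]),
        }
        if fixed["name"] and fixed["phone"]:
            cleaned.append(fixed)
    return cleaned


clean_records = cleanse(records)
```

Here the billing and support records are fixed into one consistent form, while the sales record is discarded because its phone number is blank. Whether to discard or repair an incomplete record is a business decision that this sketch simply hard-codes.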


  • Standardizing customer names from operational systems

  • Information cleansing activities

  • Accurate and complete information


  • Why do you think most businesses cannot achieve 100% accurate and complete information?
  • If they had to choose a percentage for acceptable information, what would it be and why?
    • Some companies are willing to go as low as 20% complete just to find business intelligence
    • Few organizations will go below 50% accurate – the information is useless if it is not accurate
  • Achieving perfect information is almost impossible
    • The more complete and accurate an organization wants its information to be, the more it costs
    • The tradeoff in pursuing perfect information lies in accuracy versus completeness
    • Accurate information means it is correct, while complete information means there are no blanks
    • Most organizations determine a percentage high enough to make good decisions at a reasonable cost, such as 85% accurate and 65% complete
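To make the two measures concrete, the sketch below computes completeness as the share of non-blank field values and accuracy as the share of non-blank values that match a trusted reference. The rows, the `reference` lookup, and the choice to measure accuracy only over filled-in values are assumptions made for illustration; the 85% and 65% targets are the example figures given above.

```python
# Hypothetical warehouse rows and a trusted reference used to check them.
rows = [
    {"id": 1, "email": "a@example.com", "city": "Austin"},
    {"id": 2, "email": None, "city": "Boston"},
    {"id": 3, "email": "c@example.com", "city": "Chicgo"},
    {"id": 4, "email": "d@example.com", "city": ""},
]
reference = {
    1: {"email": "a@example.com", "city": "Austin"},
    2: {"email": "b@example.com", "city": "Boston"},
    3: {"email": "c@example.com", "city": "Chicago"},
    4: {"email": "d@example.com", "city": "Denver"},
}
FIELDS = ("email", "city")


def completeness(rows):
    """Share of field values that are not blank."""
    values = [row[field] for row in rows for field in FIELDS]
    filled = [value for value in values if value not in (None, "")]
    return len(filled) / len(values)


def accuracy(rows, reference):
    """Share of non-blank field values that match the reference."""
    checked = 0
    correct = 0
    for row in rows:
        for field in FIELDS:
            if row[field] in (None, ""):
                continue  # blanks count against completeness, not accuracy
            checked += 1
            if row[field] == reference[row["id"]][field]:
                correct += 1
    return correct / checked


# Example targets from the text: 85% accurate and 65% complete.
meets_target = completeness(rows) >= 0.65 and accuracy(rows, reference) >= 0.85
```

In this sample, two of eight values are blank and one of the six filled-in values is misspelled, so the data passes the completeness target but falls just short of the accuracy target.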

BUSINESS INTELLIGENCE
  • BI is information that people use to support their decision-making efforts
  • Principal BI enablers include:
    • Technology: Even the smallest company with BI software can do sophisticated analyses today that were unavailable to the largest organizations a generation ago. The largest companies today can create enterprisewide BI systems that compute and monitor metrics on virtually every variable important for managing the company. How is this possible? The answer is technology, the most significant enabler of business intelligence.
    • People: Understanding the role of people in BI allows organizations to systematically create insight and turn these insights into actions. Organizations can improve their decision making by having the right people making the decisions. This usually means a manager who is in the field and close to the customer rather than an analyst rich in data but poor in experience. In recent years “business intelligence for the masses” has been an important trend, and many organizations have made great strides in providing sophisticated yet simple analytical tools and information to a much larger user population than previously possible.
    • Culture: A key responsibility of executives is to shape and manage corporate culture. The extent to which the BI attitude flourishes in an organization depends in large part on the organization’s culture. Perhaps the most important step an organization can take to encourage BI is to measure the performance of the organization against a set of key indicators. The actions of publishing what the organization thinks are the most important indicators, measuring these indicators, and analyzing the results to guide improvement display a strong commitment to BI throughout the organization.

The End of Chapter 8 : Accessing Organizational Information – Data Warehouse by syahirahzfri. 
Thank you for reading :)


