<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-2387670469462939744</id><updated>2011-09-03T03:44:50.322-07:00</updated><title type='text'>Business Intelligence Personal Blog</title><subtitle type='html'></subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://rht-bi.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2387670469462939744/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://rht-bi.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><author><name>Tidhi</name><uri>http://www.blogger.com/profile/11392471881762637742</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_o3SPD6mM7vA/SRgsPVAw8JI/AAAAAAAAAq8/2MjjJxF3JQg/S220/balaram.jpg'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>39</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-2387670469462939744.post-3279770799902901272</id><published>2011-08-03T08:12:00.001-07:00</published><updated>2011-08-03T08:13:33.751-07:00</updated><title type='text'>Looking for TCA Expert!!</title><content type='html'>Senior Business Data Quality Analyst&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Job Summary&lt;br /&gt;The Role:&lt;br /&gt;The Sr.lead Business Data quality Analyst will play a key global role in working with the Manager of Customer Data Architecture 
Integration in the development and implementation of a long-term data quality strategy focused on analyzing core data to identify quality gaps, implementing reporting processes to track data quality improvements over time, enhancing current systems, defining and prioritizing high-value data quality projects, and managing projects to completion by partnering with IT and Business Areas. Data quality is defined as 1) accuracy of existing data, 2) coverage of existing data variables, and 3) data augmentation – identifying new data that should be collected. This position will also provide guidance to offshore data quality analysts.&lt;br /&gt;&lt;br /&gt;Primary Responsibilities:&lt;br /&gt;* Identify areas for data quality improvement and help to resolve data quality problems through the appropriate choice of error detection and correction, process control and improvement, or process design strategies.&lt;br /&gt;* Utilize data profiling and data quality tools, as well as various data sources, to uncover and determine root causes of data quality issues&lt;br /&gt;* Review ORACLE ERP, SFDC and other data sources for data accuracy&lt;br /&gt;* Recommend maintenance enhancements to data acquisition processes to improve data accuracy&lt;br /&gt;* Develop, document, and maintain data quality goals and standards.&lt;br /&gt;* Make recommendations for enhancements to systems of record to improve accuracy of operational data&lt;br /&gt;* Work with Data Quality Manager to establish a data quality methodology documenting a repeatable set of processes for determining, investigating, and resolving data quality issues, establishing an on-going process for maintaining quality data, and defining data quality audit procedures&lt;br /&gt;&lt;br /&gt;Job 
Requirements&lt;br /&gt;Required Skills &amp;amp; Experience:&lt;br /&gt;-Bachelor's degree (Computer Science, Information Management, Business, Economics, etc.)&lt;br /&gt;-Strong knowledge of TCA and customer data management&lt;br /&gt;-Ability to multi-task, prioritize and coordinate tasks to meet multiple deadlines&lt;br /&gt;-Strong knowledge of Oracle database structure and SQL needed.&lt;br /&gt;-Ability to identify sets and subsets of information across multiple joins or unions of tables.&lt;br /&gt;-Experience with relational databases and statistical packages and analysis techniques.&lt;br /&gt;-Experience with Oracle ERP, Salesforce.com&lt;br /&gt;-Demonstrated ability to organize, coordinate, and execute on details&lt;br /&gt;-Demonstrated ability to communicate and interact with all levels and functions within an organization.&lt;br /&gt;-Self-starter with the initiative to identify and act upon opportunities without direction&lt;br /&gt;-Experience in managing projects through to completion&lt;br /&gt;-Flexible and resilient, comfortable with ambiguity, adaptable to a high-change environment&lt;br /&gt;-Attention to detail, strong personal organizational skills, and ability to work in a fast-paced, high-volume environment&lt;br /&gt;-Problem solving capabilities for data management and continuous process improvement&lt;br /&gt;-Strong verbal and writing skills, project and time management skills, ability to work in teams&lt;br /&gt;-Good working knowledge of data quality measurements, total quality management, data entry improvement, and user requirements gathering.&lt;br /&gt;-Ability to manage the change process, understand implications of data quality, measure costs and benefits of data quality, detect and correct errors in databases&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2387670469462939744-3279770799902901272?l=rht-bi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' 
type='application/atom+xml' href='http://rht-bi.blogspot.com/feeds/3279770799902901272/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2387670469462939744&amp;postID=3279770799902901272' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2387670469462939744/posts/default/3279770799902901272'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2387670469462939744/posts/default/3279770799902901272'/><link rel='alternate' type='text/html' href='http://rht-bi.blogspot.com/2011/08/looking-for-tca-expert.html' title='Looking for TCA Expert!!'/><author><name>Tidhi</name><uri>http://www.blogger.com/profile/11392471881762637742</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_o3SPD6mM7vA/SRgsPVAw8JI/AAAAAAAAAq8/2MjjJxF3JQg/S220/balaram.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2387670469462939744.post-924737929685278307</id><published>2010-12-06T20:32:00.000-08:00</published><updated>2010-12-06T20:36:55.688-08:00</updated><title type='text'>INTERNAL BLOGGING</title><content type='html'>Apologies for not blogging in the public domain. For the last year, I have been actively blogging on the company's internal blogs. This is due to the sensitive nature of the information I handle at Red Hat.&lt;br /&gt;&lt;br /&gt;I hope I can carve out some time, at least next year after the holidays.&lt;br /&gt;&lt;br /&gt;- HAPPY HOLIDAYS !!  
&amp;amp; MERRY CHRISTMAS!!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2387670469462939744-924737929685278307?l=rht-bi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://rht-bi.blogspot.com/feeds/924737929685278307/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2387670469462939744&amp;postID=924737929685278307' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2387670469462939744/posts/default/924737929685278307'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2387670469462939744/posts/default/924737929685278307'/><link rel='alternate' type='text/html' href='http://rht-bi.blogspot.com/2010/12/internal-blogging.html' title='INTERNAL BLOGGING'/><author><name>Tidhi</name><uri>http://www.blogger.com/profile/11392471881762637742</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_o3SPD6mM7vA/SRgsPVAw8JI/AAAAAAAAAq8/2MjjJxF3JQg/S220/balaram.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2387670469462939744.post-6347933793204615528</id><published>2009-11-15T06:48:00.000-08:00</published><updated>2009-11-15T06:51:00.550-08:00</updated><title type='text'>Gathering Business questions - Discovery Techniques</title><content type='html'>Working with the Business&lt;br /&gt;&lt;br /&gt;Using a single information need as a focal point, analysis and brainstorming based on any of the following methods may be effective in achieving a robust list of business questions. Repeat the process for each of the information needs of interest.&lt;br /&gt;&lt;br /&gt;    * Stakeholder Driven - Work from the list of stakeholders identified in the program charter. 
Have each stakeholder express their individual interest in the information need, and the specific business questions that they would like to have answered.&lt;br /&gt;    * Goal Oriented - Ask individual stakeholders 1) to examine the information need in the context of business goals (WIGS or PIGS), 2) to describe how they can personally contribute to meeting the goals, and 3) to discuss the kinds of information that would help them do so.&lt;br /&gt;    * Process Oriented - Explore business processes that are related to or affected by the information need. Seek specific questions about business process components (customers, products, subscriptions, events, activities, actors, etc.)&lt;br /&gt;    * Measure Based - Examine the information need to identify a set of meaningful business measures. Express each of the measures as a set of business questions. Consider measures based on finance, people and organizations, processes, markets, and customers.&lt;br /&gt;    * Source Data Analysis - Examine data sources to identify questions that the sources are able to answer. Extend the brainstorming to discuss those questions not being answered.&lt;br /&gt;    * Current Reports Analysis - As with data sources, examine existing reports to identify the questions that are and are not being answered. Again, consider questions that need historical data.&lt;br /&gt;    * Surrogate System Analysis - Examine the systems, manual and otherwise, that stakeholders use to get information not readily available from the core business systems. These include individually maintained spreadsheets and databases.&lt;br /&gt;    * Subject Analysis - When developing the warehouse subject model in parallel with identification of business questions, the subject model is a useful foundation for exploring business questions. 
Seek information about each subject that is responsive to the information need, and express as a set of business questions.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2387670469462939744-6347933793204615528?l=rht-bi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://rht-bi.blogspot.com/feeds/6347933793204615528/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2387670469462939744&amp;postID=6347933793204615528' title='39 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2387670469462939744/posts/default/6347933793204615528'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2387670469462939744/posts/default/6347933793204615528'/><link rel='alternate' type='text/html' href='http://rht-bi.blogspot.com/2009/11/gathering-business-questions-discovery.html' title='Gathering Business questions - Discovery Techniques'/><author><name>Tidhi</name><uri>http://www.blogger.com/profile/11392471881762637742</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_o3SPD6mM7vA/SRgsPVAw8JI/AAAAAAAAAq8/2MjjJxF3JQg/S220/balaram.jpg'/></author><thr:total>39</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2387670469462939744.post-857279442418632242</id><published>2009-11-15T06:42:00.000-08:00</published><updated>2009-11-15T06:48:02.616-08:00</updated><title type='text'>Data Requirements for advanced analytics</title><content type='html'>NUMBER ONE&lt;div class="jive-blog-post-body"&gt;&lt;div class="jive-blog-post-message"&gt;&lt;div class="jive-rendered-content"&gt;&lt;p style="text-align: justify;"&gt;&lt;span style="text-decoration: underline;"&gt;&lt;em&gt;&lt;strong&gt;Use advanced analytics to discover 
relationships and anticipate the future&lt;/strong&gt;&lt;/em&gt;&lt;/span&gt;&lt;/p&gt;&lt;p style="text-align: justify;"&gt;This involves discovering relationships, anticipating the future, and adapting to change. Working with the right data in the right condition is key to achieving these goals.&lt;/p&gt;&lt;p style="text-align: justify;"&gt;&lt;strong&gt;Discover relationships. &lt;/strong&gt;Whether advanced analytics is based on data mining, statistics, artificial intelligence, or complex queries, it can help you discover and quantify important relationships that you may have been unaware of. These relationships can reveal fraud, define customer segments, group products of affinity, and link field conditions that lead to product failures. The newly discovered relationships, in turn, help you reduce fraud and its costs, target marketing campaigns more accurately, develop effective merchandising strategies, and improve product quality.&lt;/p&gt;&lt;p style="text-align: justify;"&gt;&lt;strong&gt;Anticipate the future.&lt;/strong&gt; Predictive analytics can produce scores and statistics through which you can predict the likelihood of various outcomes of certain situations. For example, predictive models quantify a customer’s proclivity to churn, thereby giving you an opportunity to retain the customer. Predictive models can assist with various types of forecasting. Likewise, predictive analytics can quantify future risk for pragmatic applications.&lt;/p&gt;&lt;p style="text-align: justify;"&gt;&lt;strong&gt;Understand and adapt to change.&lt;/strong&gt; On the one hand, advanced analytics can help you understand change in the form of rising costs or new customer behaviors. 
On the other hand, the discoveries made through analytics can lead to positive changes that help your business adapt to an evolving world.&lt;/p&gt;&lt;p style="text-align: justify;"&gt;NUMBER TWO&lt;br /&gt;&lt;em&gt;&lt;span style="text-decoration: underline;"&gt;&lt;strong&gt;Scale up data integration to handle large analytic data volumes&lt;/strong&gt;&lt;/span&gt;&lt;/em&gt;&lt;/p&gt;&lt;p style="text-align: justify;"&gt;NUMBER THREE&lt;/p&gt;&lt;p style="text-align: justify;"&gt;&lt;em&gt;&lt;span style="text-decoration: underline;"&gt;&lt;strong&gt;Realize that reporting and analytics have different purposes and needs&lt;/strong&gt;&lt;/span&gt;&lt;/em&gt;&lt;/p&gt;&lt;p style="padding: 0px; min-height: 8pt; height: 8pt;"&gt; &lt;/p&gt;&lt;p style="text-align: justify;"&gt;Reporting and analytics are two different practices that have different goals, methods, sponsors, funding, and enabling technologies. Yet many people confuse the two, perhaps because platforms for business intelligence (BI) include functions for various types of reporting and summarized analysis in the form of online analytic processing (OLAP).&lt;/p&gt;&lt;p style="padding: 0px; min-height: 8pt; height: 8pt;"&gt; &lt;/p&gt;&lt;p style="text-align: justify;"&gt;NUMBER FOUR&lt;br /&gt;&lt;em&gt;&lt;span style="text-decoration: underline;"&gt;&lt;strong&gt;Distinguish between data warehouses, data marts, and analytic databases&lt;/strong&gt;&lt;/span&gt;&lt;/em&gt;&lt;/p&gt;&lt;table style="border: 1px solid rgb(0, 0, 0); width: 100%;" border="1" cellpadding="3" cellspacing="0"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;th style="background-color: rgb(102, 144, 188);" align="center" valign="middle"&gt;&lt;br /&gt;&lt;/th&gt;&lt;th style="background-color: rgb(102, 144, 188);" align="center" valign="middle"&gt;&lt;span style="color: rgb(255, 255, 255);"&gt;&lt;strong&gt;Enterprise Data Warehouse&lt;/strong&gt;&lt;/span&gt;&lt;/th&gt;&lt;th style="background-color: rgb(102, 144, 188);" align="center" 
valign="middle"&gt;&lt;span style="color: rgb(255, 255, 255);"&gt;&lt;strong&gt;Data Mart&lt;/strong&gt;&lt;/span&gt;&lt;/th&gt;&lt;th style="background-color: rgb(102, 144, 188);" align="center" valign="middle"&gt;&lt;span style="color: rgb(255, 255, 255);"&gt;&lt;strong&gt;Analytic Database&lt;/strong&gt;&lt;/span&gt;&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Business Method&lt;/td&gt;&lt;td&gt;Single version of the truth for enterprise performance.&lt;/td&gt;&lt;td&gt;Single subject area(s) for application-specific purposes.&lt;/td&gt;&lt;td&gt;Test bed for exploring change and opportunity.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Optimization&lt;/td&gt;&lt;td&gt;Multiple update speeds, high performance, workload management, in-database analytics.&lt;/td&gt;&lt;td&gt;Regularly updated data for reporting, performance management, and OLAP.&lt;/td&gt;&lt;td&gt;Unpredictable data sets about changing markets, costs, customers, risks, etc.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Data Attributes&lt;/td&gt;&lt;td&gt;High standards for production data, plus inclusion of experimental data.&lt;/td&gt;&lt;td&gt;Carefully transformed, cleansed, modeled, and audited.&lt;/td&gt;&lt;td&gt;Less cleansed and modeled. Often just raw source data.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Data Models&lt;/td&gt;&lt;td&gt;3NF data model to model the enterprise with views for application flexibility.&lt;/td&gt;&lt;td&gt;Relational models for reporting.&lt;br /&gt;Multi-dimensional models for OLAP.&lt;/td&gt;&lt;td&gt;3NF of source data. Models demanded by analytic tools. 
Predictive models and scores.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Data Lifecycle&lt;/td&gt;&lt;td&gt;Permanent history with transient, elastic logical marts.&lt;/td&gt;&lt;td&gt;Permanent history of enterprise performance.&lt;/td&gt;&lt;td&gt;Data tends to be transient, as analytic needs change.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Data Acquisition&lt;/td&gt;&lt;td&gt;Well-governed process with the flexibility for self-provisioning elastic logical marts.&lt;/td&gt;&lt;td&gt;Slow process due to data transformation, cleansing, modeling, audit trail, etc.&lt;/td&gt;&lt;td&gt;Load data fast with little prep and start analysis immediately, regardless of state of data.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;br /&gt;&lt;/td&gt;&lt;td&gt;&lt;br /&gt;&lt;/td&gt;&lt;td&gt;&lt;br /&gt;&lt;/td&gt;&lt;td&gt;&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;p style="padding: 0px; min-height: 8pt; height: 8pt;"&gt; &lt;/p&gt;&lt;p style="padding: 0px; min-height: 8pt; height: 8pt;"&gt; &lt;/p&gt;&lt;p style="text-align: justify;"&gt;NUMBER FIVE&lt;br /&gt;&lt;em&gt;&lt;span style="text-decoration: underline;"&gt;&lt;strong&gt;Design a data warehouse architecture that accommodates analytics&lt;/strong&gt;&lt;/span&gt;&lt;/em&gt;&lt;/p&gt;&lt;p style="text-align: justify;"&gt;&lt;br /&gt;NUMBER SIX&lt;br /&gt;&lt;em&gt;&lt;span style="text-decoration: underline;"&gt;&lt;strong&gt;Prepare data to meet the needs of the analytic method you’ve chosen&lt;/strong&gt;&lt;/span&gt;&lt;/em&gt;&lt;/p&gt;&lt;p style="text-align: justify;"&gt;&lt;br /&gt;NUMBER SEVEN&lt;br /&gt;&lt;em&gt;&lt;span style="text-decoration: underline;"&gt;&lt;strong&gt;Preserve analytic data’s rich details, because they enable discovery&lt;/strong&gt;&lt;/span&gt;&lt;/em&gt;&lt;/p&gt;&lt;p style="text-align: justify;"&gt;&lt;br /&gt;NUMBER EIGHT&lt;br /&gt;&lt;strong&gt;&lt;em&gt;&lt;span style="text-decoration: underline;"&gt;Improve data after working with it, not 
before&lt;/span&gt;&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;&lt;p style="padding: 0px; min-height: 8pt; height: 8pt;"&gt; &lt;/p&gt;&lt;p style="text-align: justify;"&gt;NUMBER NINE&lt;br /&gt;&lt;em&gt;&lt;span style="text-decoration: underline;"&gt;&lt;strong&gt;Apply the products of advanced analytics to BI  and DW activities&lt;/strong&gt;&lt;/span&gt;&lt;/em&gt;&lt;/p&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2387670469462939744-857279442418632242?l=rht-bi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://rht-bi.blogspot.com/feeds/857279442418632242/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2387670469462939744&amp;postID=857279442418632242' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2387670469462939744/posts/default/857279442418632242'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2387670469462939744/posts/default/857279442418632242'/><link rel='alternate' type='text/html' href='http://rht-bi.blogspot.com/2009/11/data-requirements-for-advanced.html' title='Data Requirements for advanced analytics'/><author><name>Tidhi</name><uri>http://www.blogger.com/profile/11392471881762637742</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_o3SPD6mM7vA/SRgsPVAw8JI/AAAAAAAAAq8/2MjjJxF3JQg/S220/balaram.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2387670469462939744.post-4518842216738786998</id><published>2009-03-10T10:21:00.000-07:00</published><updated>2009-03-10T10:22:28.427-07:00</updated><title type='text'>Balanced Scorecard - Executing Strategy</title><content type='html'>Some more 
updates...&lt;br /&gt;&lt;br /&gt;Scorecards are for managing intangibles. We should not focus only on financial metrics and lose sight of intangibles like customers, processes, customer relationships, people, culture, etc.&lt;br /&gt;Having a financial perspective is good but not enough; everything that contributes to financial success should be considered.&lt;br /&gt;&lt;br /&gt;- Measures (metrics: dashboards, KPIs, operational indicators) should be linked to strategy.&lt;br /&gt;&lt;br /&gt;- We need to have a strategy at the company level. Employees should understand the strategy. Sharing of scorecards should be encouraged.&lt;br /&gt;&lt;br /&gt;One example given: the CEO walks up to a random employee with a strategy map (a laminated page/scorecard) and asks the employee to explain it and which piece of it the employee contributes to... The strategy map will serve as a pathway to the future.&lt;br /&gt;&lt;br /&gt;Feeders for the strategy map: how can we be different from and better than competitors, which processes are of concern, etc.&lt;br /&gt;&lt;br /&gt;As in the Marine Corps, each soldier is briefed on the mission and then makes sure he/she has everything needed for the battle ... 
When the soldier is on the battlefield - ** Anything can happen** - We hope he/she will stick to the mission :-)&lt;br /&gt;&lt;br /&gt;Speaker stated: CEOs should not be protective about their strategy :-)&lt;br /&gt;&lt;br /&gt;Define a VISION STRATEGY and the short-term strategies needed to achieve the vision.&lt;br /&gt;&lt;br /&gt;- Conduct a monthly meeting about strategy and DO NOT discuss operational inefficiencies&lt;br /&gt; OBJECTIVE:&lt;br /&gt;- What's happening?&lt;br /&gt;- Measure&lt;br /&gt;- Trend&lt;br /&gt;- Why is it happening?&lt;br /&gt;- Root Cause&lt;br /&gt;- Who is in charge?&lt;br /&gt;- Who is the resource?&lt;br /&gt;- Assign Accountability&lt;br /&gt;&lt;br /&gt;............more later&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2387670469462939744-4518842216738786998?l=rht-bi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://rht-bi.blogspot.com/feeds/4518842216738786998/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2387670469462939744&amp;postID=4518842216738786998' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2387670469462939744/posts/default/4518842216738786998'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2387670469462939744/posts/default/4518842216738786998'/><link rel='alternate' type='text/html' href='http://rht-bi.blogspot.com/2009/03/balanced-scorecard-executing-strategy.html' title='Balanced Scorecard - Executing Strategy'/><author><name>Tidhi</name><uri>http://www.blogger.com/profile/11392471881762637742</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='http://4.bp.blogspot.com/_o3SPD6mM7vA/SRgsPVAw8JI/AAAAAAAAAq8/2MjjJxF3JQg/S220/balaram.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2387670469462939744.post-6625556451929940297</id><published>2009-03-09T21:41:00.000-07:00</published><updated>2009-03-09T22:09:27.966-07:00</updated><title type='text'>Defining Enterprise Metrics - Gartner Business Intelligence Summit 09  in Washington DC</title><content type='html'>The updates you see in the next few days (until the 11th) will be excerpts from my experiences at the summit. The blog entries may be very raw and uncut (my apologies) ** it's 12:46 a.m.&lt;br /&gt;&lt;br /&gt;The sessions are really useful, and the one-on-one meetings with Gartner experts were very productive.&lt;br /&gt;&lt;br /&gt;Defining Enterprise Metrics (Round Table)&lt;br /&gt;&lt;br /&gt;The round robin started, and the organizer asked us to introduce ourselves and say what we were expecting from the session.&lt;br /&gt;My update: "We were successful in getting business buy-in and are at a point to step back and do some scorecarding, metrics (what the business really needs - pain points), master data, and metadata."&lt;br /&gt;&lt;br /&gt;Ears perked up when I said we got business buy-in. The presenter said he would come back to me on that topic.&lt;br /&gt;&lt;br /&gt;I gave my two cents about how we approached the business intelligence initiative. Firstly, I gave credit to KARMA :-) Right time, right place.. 
Quick wins - Evangelize - Inform users - Work closely with the business, etc.&lt;br /&gt;&lt;br /&gt;Overall it was an interesting session.&lt;br /&gt;&lt;br /&gt;Key Takeaways..&lt;br /&gt;&lt;br /&gt;- We need to set up a metrics framework (refer to PCF - Process Classification Framework)&lt;br /&gt;- We need to identify process champions, i.e., process facilitators, not process owners&lt;br /&gt;- Practice performance management; don't just implement it&lt;br /&gt;- Let the business keep its own metrics; try to functionally understand the business and tactfully work with the users to justify that a particular metric holds its value.&lt;br /&gt;- An analogy: like a voltmeter across two nodes, we can use a metric (the voltmeter in this case) at a process node to evaluate the potential value (voltage).&lt;br /&gt;- Work closely with Business Analysts; they should be accountable for producing process models and documentation.&lt;br /&gt;- Review the Process Classification Framework from the American Productivity &amp;amp; Quality Center (APQC)&lt;br /&gt;www.apqc.org&lt;br /&gt;- For any given metric there can be a set of assumptions (variables). Does the business agree with the assumptions made?&lt;br /&gt;&lt;br /&gt;Book of Interest:&lt;br /&gt;Winning KPIs - David Parmenter&lt;br /&gt;&lt;br /&gt;Tool:&lt;br /&gt;Process Modeling &lt;br /&gt;http://www.visual-paradigm.com/product/vpuml/&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2387670469462939744-6625556451929940297?l=rht-bi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://rht-bi.blogspot.com/feeds/6625556451929940297/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2387670469462939744&amp;postID=6625556451929940297' title='0 Comments'/><link rel='edit' type='application/atom+xml' 
href='http://www.blogger.com/feeds/2387670469462939744/posts/default/6625556451929940297'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2387670469462939744/posts/default/6625556451929940297'/><link rel='alternate' type='text/html' href='http://rht-bi.blogspot.com/2009/03/defining-enterprise-metrics-gartner.html' title='Defining Enterprise Metrics - Gartner Business Intelligence Summit 09  in Washington DC'/><author><name>Tidhi</name><uri>http://www.blogger.com/profile/11392471881762637742</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_o3SPD6mM7vA/SRgsPVAw8JI/AAAAAAAAAq8/2MjjJxF3JQg/S220/balaram.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2387670469462939744.post-8653055915888660252</id><published>2009-01-16T17:52:00.000-08:00</published><updated>2009-01-16T18:16:25.028-08:00</updated><title type='text'>The Only Way to Build Trust Is to Have the Business Own</title><content type='html'>It’s one thing to understand the key indicators of a successful BI solution, but it’s another to roll up your sleeves and make the solution work. &lt;br /&gt;&lt;br /&gt;Here are six success factors that minimize project risks and increase the likelihood of success:&lt;br /&gt;      1. Establish a vision.&lt;br /&gt;      2. Evangelize the vision.&lt;br /&gt;      3. Prioritize the portfolio.&lt;br /&gt;      4. Allocate sufficient resources.&lt;br /&gt;      5. Align business and IT for the long haul.&lt;br /&gt;      6. Build trust in the system.&lt;br /&gt;&lt;br /&gt;These are the “bread and butter” issues in BI. Interestingly, the keys to success are not technical in nature. Projects don’t succeed because they use an innovative design or radical new technology. They succeed because of the “soft” stuff: leadership, communication, planning, the solution, and interpersonal relationships. 
Organizations must master these as much as the technical&lt;br /&gt;designs and tools required to deploy BI solutions.&lt;br /&gt;&lt;br /&gt;It’s interesting that almost all of our key indicators of success are non-technical in nature. All&lt;br /&gt;technical issues, including infrastructure and analytical tools, require business oversight and&lt;br /&gt;guidance to be implemented correctly.&lt;br /&gt;&lt;br /&gt;Business intelligence can provide significant value to your organization. It can provide high&lt;br /&gt;ROI and be a critical enabler of key business strategies and tactics for competing in an&lt;br /&gt;increasingly tough marketplace. By benchmarking your organization against our key success&lt;br /&gt;indicators and following our six critical success factors, your organization will be better able to extract value from BI.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2387670469462939744-8653055915888660252?l=rht-bi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://rht-bi.blogspot.com/feeds/8653055915888660252/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2387670469462939744&amp;postID=8653055915888660252' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2387670469462939744/posts/default/8653055915888660252'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2387670469462939744/posts/default/8653055915888660252'/><link rel='alternate' type='text/html' href='http://rht-bi.blogspot.com/2009/01/only-way-to-build-trust-is-to-have.html' title='The Only Way to Build Trust Is to Have the Business Own'/><author><name>Tidhi</name><uri>http://www.blogger.com/profile/11392471881762637742</uri><email>noreply@blogger.com</email><gd:image 
rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_o3SPD6mM7vA/SRgsPVAw8JI/AAAAAAAAAq8/2MjjJxF3JQg/S220/balaram.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2387670469462939744.post-5487179588530302420</id><published>2008-12-26T12:21:00.000-08:00</published><updated>2008-12-26T12:30:32.501-08:00</updated><title type='text'>Making Business Intelligence Work</title><content type='html'>Companies large and small can benefit from a well-thought-out business intelligence strategy, but developing that strategy can sometimes be more challenging than implementing a business intelligence program. &lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;First Steps&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;If you need historical data on your company or customers, make sure you know where it is and how to access it. You’ll also need to take some time to understand your business needs and make a rough road map for how you would like your business to transform. Focus your attention on a few operational objectives that can be achieved in both the short and the long term.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;Planning Your Strategy&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Start with your vision of where you would like your company to be in 3 to 5 years. Pick specific goals that can be achieved with the assistance of a good business intelligence program. These goals should have measurable success rates, or key performance indicators (KPIs). You will use your KPIs as milestones that show how close you are to accomplishing your goals. You can then set up your information infrastructure to collect the necessary data, which can be analyzed and turned into information for making effective decisions for your company.&lt;br /&gt;&lt;br /&gt;In order to pick the most worthwhile goals, you must understand what the desired end result is. 
You will have to ask yourself what you will do with the business intelligence information your program generates once you have it. Otherwise, you will have a vast amount of information and no way to strategize around it. In companies without a comprehensive long-term plan, decision makers react to the information they get from their data without understanding the far-reaching consequences of their actions.&lt;br /&gt;&lt;br /&gt;Another important thing to think about before you implement your program is what your options are once you have the information. Often companies have an idea of how they want to move forward and are looking to the business intelligence information generated by their BI program to support their current strategy. They soon learn that the information is not always in line with their current plan. Brainstorm all possible outcomes of your business intelligence program and think of ways that your company can improve based on the different results. You probably won’t be able to come up with every single possibility, but you will be prepared to think creatively when your BI program starts to generate useful information.&lt;br /&gt;&lt;br /&gt;Lastly, remember why you are implementing this program. Business intelligence can be a huge asset to your business, but if you focus on the data and not on what the data can do for your company, your energy is misplaced. 
Before you become overwhelmed with data warehousing and data integration initiatives, take a step back and refocus on your company’s goals.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2387670469462939744-5487179588530302420?l=rht-bi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://rht-bi.blogspot.com/feeds/5487179588530302420/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2387670469462939744&amp;postID=5487179588530302420' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2387670469462939744/posts/default/5487179588530302420'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2387670469462939744/posts/default/5487179588530302420'/><link rel='alternate' type='text/html' href='http://rht-bi.blogspot.com/2008/12/making-business-intelligence-work.html' title='Making Business Intelligence Work'/><author><name>Tidhi</name><uri>http://www.blogger.com/profile/11392471881762637742</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_o3SPD6mM7vA/SRgsPVAw8JI/AAAAAAAAAq8/2MjjJxF3JQg/S220/balaram.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2387670469462939744.post-3371735408655260182</id><published>2008-12-14T17:50:00.000-08:00</published><updated>2008-12-14T18:05:36.008-08:00</updated><title type='text'>Data Warehouse Manager - An identity problem.</title><content type='html'>The data warehouse manager has been given control of one of the most valuable assets of any organization: the data. Furthermore, the data warehouse manager is expected to interpret and deliver that asset to the rest of the organization in a way that makes it most useful. 
&lt;br /&gt;&lt;br /&gt;In spite of all this visibility, many newly appointed data warehouse managers are simply given their titles without a clear job definition. It is not sufficient or helpful to just say that a data warehouse manager’s job is to "bring all the data into a central place and make it available for management to make decisions." Although such a definition may be correct, it isn’t precise enough for anyone to tell if the data manager has done the job well.&lt;br /&gt;&lt;br /&gt;A good metaphor for the job of the data warehouse manager is the job of an editor-in-chief. Think about what an editor-in-chief of books, magazines, or newspapers does. At a high level, the editor-in-chief:&lt;br /&gt;&lt;br /&gt;    * Collects input from a variety of sources, including third-party authors, investigative reporters, and in-house writers&lt;br /&gt;    * Assures the quality of this input by correcting spelling, removing mistakes, and removing questionable material&lt;br /&gt;    * Applies broad editorial control over the nature of the published material and assures a consistent editorial view&lt;br /&gt;    * Publishes on a regular schedule (certainly magazine and newspaper editors do)&lt;br /&gt;    * Relies on and respects the trust of the readers&lt;br /&gt;    * Is named prominently on the masthead to serve as a clear communication as to where the buck stops&lt;br /&gt;    * Is driven by continuously changing demographics and reader interests&lt;br /&gt;    * Is driven by the rapidly changing media technologies, especially the Internet revolution that is happening as we speak&lt;br /&gt;    * Is very aware of the power of the media and consciously markets the publications. &lt;br /&gt;&lt;br /&gt;These statements seem a little obvious because we all know, based on experience, what the job title "editor-in-chief" implies. And most editors-in-chief understand very clearly that they don’t create the content about which they write, report, or publish. 
They are, rather, the purveyors of content created by others.&lt;br /&gt;&lt;br /&gt;I hope that you have been struck by the many parallels between the job of an editor-in-chief and the job of a data warehouse manager. Perhaps a good way to sum this up is to say that the job of the data warehouse manager is to publish the enterprise’s data.&lt;br /&gt;&lt;br /&gt;Let’s examine the parallels between these two jobs. In most cases, the data warehouse manager is aggressively pursuing the same goals as the editor-in-chief. In some cases, the data warehouse manager could learn some useful things by emulating the editor-in-chief. At a high level, the data warehouse manager:&lt;br /&gt;&lt;br /&gt;    * Collects data inputs from a variety of sources, including legacy operational systems, third-party data suppliers, and informal sources&lt;br /&gt;    * Assures the quality of these data inputs by correcting spelling, removing mistakes, eliminating null data, and combining multiple sources&lt;br /&gt;    * Applies broad data stewardship over the nature of the published data and assures the use of conformed dimensions and facts across the disparate data marts (which can be thought of as separate publications)&lt;br /&gt;    * Releases the data from the data staging area to the individual data marts on a regular schedule&lt;br /&gt;    * Relies on and respects the trust of the end users&lt;br /&gt;    * Is named prominently on the organizational chart to serve as a clear communication as to where the buck stops&lt;br /&gt;    * Is driven by the continuously changing business requirements of the organization and the increasingly available sources of data&lt;br /&gt;    * Is driven by rapidly changing media technologies, especially the current Internet revolution&lt;br /&gt;    * Is very aware of the business significance of the data warehouse and consciously "captures" and takes credit for the business decisions made as a result of using the data warehouse. 
&lt;br /&gt;&lt;br /&gt;In addition, the data warehouse manager has a number of responsibilities that most editors-in-chief do not have to think about. These special data warehouse responsibilities include backing up all the data sources and all the final, published versions of the data. These backups must be available, sometimes on an emergency basis, to recover from disasters or provide detail that wasn’t published the first time around. The data warehouse manager must deal with overwhelming volumes of data and must diligently avoid being stranded by obsolete backups. The data warehouse manager must replicate published data in a highly synchronized way to each of the downstream "publications" (data marts) and provide a detailed audit trail of where the data came from and what its lineage and provenance are. The data warehouse manager must be able to explain the significance and true meaning of the data and justify each editing step that the data staging area may have performed on the data before it was published. The data warehouse manager must protect published data from all unauthorized readers. Of all the responsibilities the data warehouse manager has in addition to the classic editorial responsibilities, this security requirement is the most nightmarish; it is also the biggest departure from the editing metaphor. The data warehouse manager must somehow balance the goal of publishing the data to everyone with the goal of protecting the data from everyone. No wonder data warehouse managers have an identity problem.&lt;br /&gt;&lt;br /&gt;In the discussion of the traditional editor-in-chief’s set of responsibilities, we remarked that nearly all editors-in-chief understand that they are merely the purveyors of content created by others. Most editors don’t have a boundary problem in this area. Many data warehouse managers, on the other hand, do. 
Frequently, the data warehouse manager agrees to be responsible for allocations, forecasts, behavior scoring, modeling, or data mining. This is a major mistake! All these activities are content creation activities. It is understandable that the data warehouse manager is drawn into these activities, because, in many cases, there is no prior model for the division of the new responsibilities between IS and an end-user group such as finance. If an organization has never had good allocated costs, for example, and the data warehouse manager is asked to present these costs, then the data warehouse manager is also going to be expected to create these costs.&lt;br /&gt;&lt;br /&gt;The data warehouse manager should treat allocations, forecasts, behavior scoring, modeling, and data mining as clients of the data warehouse. These activities should be the responsibilities of various analytic groups in the finance and marketing departments, and these groups should have an arm’s length relationship to the data warehouse. They should consume warehouse data as inputs to their analytic activities and, possibly, engage the data warehouse to republish their results in a data mart when they are done. But these activities should not be mixed into all the mainline publishing activities of the data warehouse.&lt;br /&gt;&lt;br /&gt;Creating allocation rules that let you assign infrastructure costs to various product lines or various marketing initiatives is a political hot potato. It is easy for a data warehouse manager to get pulled into creating allocations because it is a necessary step in bringing up a profit-oriented data mart. The data warehouse manager should be aware of the possibility of this task being thrust on the data warehouse and should tell management that, for example, the "data warehouse will be glad to publish the allocation numbers once the finance department has created them." 
In this case, the editorial metaphor is a useful guide.&lt;br /&gt;&lt;br /&gt;**Source: an old article by Ralph Kimball&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2387670469462939744-3371735408655260182?l=rht-bi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://rht-bi.blogspot.com/feeds/3371735408655260182/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2387670469462939744&amp;postID=3371735408655260182' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2387670469462939744/posts/default/3371735408655260182'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2387670469462939744/posts/default/3371735408655260182'/><link rel='alternate' type='text/html' href='http://rht-bi.blogspot.com/2008/12/data-warehouse-manager-identity-problem.html' title='Data Warehouse Manager - An identity problem.'/><author><name>Tidhi</name><uri>http://www.blogger.com/profile/11392471881762637742</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_o3SPD6mM7vA/SRgsPVAw8JI/AAAAAAAAAq8/2MjjJxF3JQg/S220/balaram.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2387670469462939744.post-3726873175054118873</id><published>2008-12-07T04:35:00.000-08:00</published><updated>2008-12-07T05:07:11.164-08:00</updated><title type='text'>Business Intelligence:  Nostradamus</title><content type='html'>1. 90% of BI will be delivered on mobile devices.&lt;br /&gt;   2. Much larger volumes of data, beyond terabytes, will need to be handled, changing the hardware and software paradigm.&lt;br /&gt;   3. 
Information will be managed by business professionals and not technology professionals.&lt;br /&gt;   4. Do not listen to BI best practices that say "18 months to deliver" and "high costs for licenses/services". We need to think outside the box.&lt;br /&gt;   5. Envision a new role within an organization called the Information Architect (derived from the Data Architect role). This person understands the entire life cycle of information, from where it comes from to where it ends up.&lt;br /&gt;   6. Better adoption of BI will happen when BI is integrated within productivity suites, like Office. BI will be an extension of business productivity.&lt;br /&gt;   7. BI should not be packaged with ERPs because the role of BI should be to integrate disparate systems. However, Performance Management should be packaged with ERPs.&lt;br /&gt;   8. BI should be considered a service for business people.  Business is already used to leveraging services.  However, IT has not yet matured to the point of understanding this role.&lt;br /&gt;   9. SaaS BI's success will depend on overcoming two serious roadblocks: Data Quality and Data Security.&lt;br /&gt;  10. 
Organizations will have 2 application footprints:  ERP systems for capturing information and IRP (information resource planning) systems for massaging and delivering information.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2387670469462939744-3726873175054118873?l=rht-bi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://rht-bi.blogspot.com/feeds/3726873175054118873/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2387670469462939744&amp;postID=3726873175054118873' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2387670469462939744/posts/default/3726873175054118873'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2387670469462939744/posts/default/3726873175054118873'/><link rel='alternate' type='text/html' href='http://rht-bi.blogspot.com/2008/12/business-intelligence-me-nostradamus.html' title='Business Intelligence:  Nostradamus'/><author><name>Tidhi</name><uri>http://www.blogger.com/profile/11392471881762637742</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_o3SPD6mM7vA/SRgsPVAw8JI/AAAAAAAAAq8/2MjjJxF3JQg/S220/balaram.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2387670469462939744.post-119810445101793569</id><published>2008-11-22T06:29:00.000-08:00</published><updated>2008-11-22T06:38:00.526-08:00</updated><title type='text'>Creating High Performing Teams</title><content type='html'>It is widely known and accepted that collaboration is quintessentially agile. What isn’t as widely understood is that true teamwork unlocks a level of collaboration that is hard to come by through any other approach. 
Even less known is that building, sustaining, and supporting high performing teams takes hard work. It takes an investment from each individual on the team, engagement and focus from management, and a supportive culture.&lt;br /&gt;&lt;br /&gt;First, let’s look at the business case behind teamwork. Why would you invest in creating great teams?&lt;br /&gt;Productivity/Speed: True teams have the ability to achieve very high levels of productivity. Teams are able to create work patterns, focus, and shared leadership. These dynamics allow teams to move very fast while maintaining high quality. They are able to achieve more with fewer people and enable the organization to change direction and respond to business demand more effectively.&lt;br /&gt;Innovation/Creativity: Teams reach a point where they can use conflict and collaborative decision making to solve complex problems in unique ways. When you couple this innovative capability with customer collaboration (as Agile aims to do), you are able to deliver a differentiated business result.&lt;br /&gt;Associate Satisfaction: Most of us enjoy having friends and personal connections at work. Teams give people the ability to create an environment that builds friendships and engagement beyond the work that they do.&lt;br /&gt;&lt;br /&gt;Set Teams Up for Success&lt;br /&gt;In my experience there are some very tangible and tactical ways to help teams take the right steps towards high performance early on in their formation.&lt;br /&gt;Outer Product: Teams need to have a clear vision and goal. They need to be able to answer why they are a team and what they need to achieve for the organization. This needs to be clearly articulated and made continuously visible. Have the team establish a mission, print it out, post it in the room, and revisit it on a regular basis.&lt;br /&gt;Inner Product: Establish a clear reason for each person to be part of the team. 
Each team member needs to be able to answer what they want personally from being on the team. Do they want a better work/life balance, to learn new skills, or to earn a promotion or a good rating from their manager? Having team members understand each other’s motivation for being a team member creates a strong connection between them.&lt;br /&gt;Celebrate Success: Celebrating success is not a random act of kindness from management, like holding a happy hour because they sense low morale. Worse yet, it is not rewarding heroism and “death march” behavior. Celebrating success is having teams recognize specific achievements that they laid out as goals for themselves. This could be as small as a round of applause for the completion of a User Story or as elaborate as a team outing celebrating the successful implementation of a release. The point is that the celebration should be for the team and tied to the achievement of specific goals that the team set for itself.&lt;br /&gt;Working Agreements: Teams need to collaborate on the creation of a set of agreements or ground rules that they will hold each other accountable for. These can range from working hours to agreements on feedback and communication styles.&lt;br /&gt;Personal Commitment: A key ingredient for effective teams is the personal commitment of each individual to the team. 
We go as far as asking each individual on the team what they would be willing to do to help the team become a strong team and achieve its goals.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2387670469462939744-119810445101793569?l=rht-bi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://rht-bi.blogspot.com/feeds/119810445101793569/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2387670469462939744&amp;postID=119810445101793569' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2387670469462939744/posts/default/119810445101793569'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2387670469462939744/posts/default/119810445101793569'/><link rel='alternate' type='text/html' href='http://rht-bi.blogspot.com/2008/11/creating-high-performing-teams.html' title='Creating High Performing Teams'/><author><name>Tidhi</name><uri>http://www.blogger.com/profile/11392471881762637742</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_o3SPD6mM7vA/SRgsPVAw8JI/AAAAAAAAAq8/2MjjJxF3JQg/S220/balaram.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2387670469462939744.post-426562198935352653</id><published>2008-11-12T18:51:00.000-08:00</published><updated>2008-11-12T18:54:24.142-08:00</updated><title type='text'>Business Intelligence - Actionable Insights</title><content type='html'>A few “actionable insights”:&lt;br /&gt;&lt;br /&gt;1. Find the “right” person within the company who can help create insights. It may be your CEO! If you cannot find such a person, don’t start the project.&lt;br /&gt;2. 
Start small and scale up as you taste success; before you attempt to build the Taj Mahal, practice on some smaller buildings.&lt;br /&gt;3. Tools and technology matter, but in the end, data quality makes the difference.&lt;br /&gt;4. Do not be averse to restarting from scratch if the first or second model does not deliver. It’s quicker to recreate a model than to patch a bad one.&lt;br /&gt;5. Keep asking questions at every stage: “Why do you want this?”, “How will it help you or the company?”, “Who else can benefit from this?” You get the picture.&lt;br /&gt;6. Whatever you do in stage 1 may need to be discarded by the time you are in stage 3, and that’s okay.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2387670469462939744-426562198935352653?l=rht-bi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://rht-bi.blogspot.com/feeds/426562198935352653/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2387670469462939744&amp;postID=426562198935352653' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2387670469462939744/posts/default/426562198935352653'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2387670469462939744/posts/default/426562198935352653'/><link rel='alternate' type='text/html' href='http://rht-bi.blogspot.com/2008/11/business-intelligence-actionable.html' title='Business Intelligence - Actionable Insights'/><author><name>Tidhi</name><uri>http://www.blogger.com/profile/11392471881762637742</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='http://4.bp.blogspot.com/_o3SPD6mM7vA/SRgsPVAw8JI/AAAAAAAAAq8/2MjjJxF3JQg/S220/balaram.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2387670469462939744.post-3262780287819644858</id><published>2008-11-12T18:49:00.000-08:00</published><updated>2008-11-12T18:51:05.845-08:00</updated><title type='text'>Business Intelligence for beginners</title><content type='html'>Business intelligence (BI) is not business as usual. It’s about making better decisions more easily and more quickly. Businesses collect enormous amounts of data every day: information about orders, inventory, accounts payable, point-of-sale transactions, and of course, customers. Businesses also acquire data, such as demographics and mailing lists, from outside sources. Unfortunately, according to a recent survey, over 93% of corporate data is not usable in the business decision-making process today. Consolidating and organizing data for better business decisions can lead to a competitive advantage, and learning to uncover and leverage those advantages is what business intelligence is all about.&lt;br /&gt;&lt;br /&gt;The amount of business data is increasing exponentially. In fact, it doubles every two to three years. More information means more competition. In the age of the information explosion, executives, managers, professionals, and workers all need to be able to make better decisions faster. Now, more than ever, time is money.&lt;br /&gt;&lt;br /&gt;Business Intelligence solutions are not about bigger and better technology; they are about delivering more sophisticated information to the business end user. BI provides an easy-to-use, shareable resource that is powerful, cost-effective, and scalable to your needs.&lt;br /&gt;&lt;br /&gt;Much more than a combination of data and technology, BI helps you to create knowledge from a world of information. 
Get the right data, discover its power, and share the value: BI transforms information into knowledge. Business Intelligence is the practice of putting the right information into the hands of the right user at the right time to support the decision-making process.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2387670469462939744-3262780287819644858?l=rht-bi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://rht-bi.blogspot.com/feeds/3262780287819644858/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2387670469462939744&amp;postID=3262780287819644858' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2387670469462939744/posts/default/3262780287819644858'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2387670469462939744/posts/default/3262780287819644858'/><link rel='alternate' type='text/html' href='http://rht-bi.blogspot.com/2008/11/business-intelligence-for-beginners.html' title='Business Intelligence for beginners'/><author><name>Tidhi</name><uri>http://www.blogger.com/profile/11392471881762637742</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_o3SPD6mM7vA/SRgsPVAw8JI/AAAAAAAAAq8/2MjjJxF3JQg/S220/balaram.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2387670469462939744.post-27659536369376001</id><published>2008-11-12T18:48:00.000-08:00</published><updated>2008-11-12T18:49:23.475-08:00</updated><title type='text'>Business Intelligence Implementaion</title><content type='html'>A Business Intelligence project has to deal with three major topics:&lt;br /&gt;&lt;br /&gt;• Infrastructure&lt;br /&gt;• Data&lt;br /&gt;• 
Application&lt;br /&gt;&lt;br /&gt;• Infrastructure includes all the tasks necessary to provide the technical basis for the Business Intelligence environment. This includes the installation and implementation of new hardware and software, the connectivity between the legacy environment and the new Business Intelligence environment on a network, as well as on a database level, and the implementation of a population subsystem, an administration subsystem, and a management subsystem. Establishing the infrastructure for the first Business Intelligence solution is time consuming, but with the selection of scalable hardware and software components, the effort will decrease dramatically for the next project or delivery cycle.&lt;br /&gt;&lt;br /&gt;• Data deals with data access, mapping, derivation, transformation, and aggregation according to the requirements and business rules, as well as with the proper definition of the data items in business terms (metadata). It also contains the tasks necessary to ensure the consistency and quality of the information being transferred to the Business Intelligence environment. The effort for the tasks involved in the data topic should decrease with each new Business Intelligence project, depending on the amount of data that can be reused from previous projects (or iterations).&lt;br /&gt;&lt;br /&gt;• Application includes the gathering of the business requirements, the design of the model, and the implementation, visualization, and publication of the analysis results in terms of, for example, queries, reports, and charts. 
The effort needed for the tasks within the application topic is heavily dependent on the selected scope of the project.&lt;br /&gt;&lt;br /&gt;The scope of a Business Intelligence project should be selected in such a way that a complete solution (that is, infrastructure, data, and application) for the business analysis domain selected can be offered and valuable results can be delivered to the business analysts within a reasonable timeframe (no longer than six months).&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2387670469462939744-27659536369376001?l=rht-bi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://rht-bi.blogspot.com/feeds/27659536369376001/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2387670469462939744&amp;postID=27659536369376001' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2387670469462939744/posts/default/27659536369376001'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2387670469462939744/posts/default/27659536369376001'/><link rel='alternate' type='text/html' href='http://rht-bi.blogspot.com/2008/11/business-intelligence-implementaion.html' title='Business Intelligence Implementaion'/><author><name>Tidhi</name><uri>http://www.blogger.com/profile/11392471881762637742</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_o3SPD6mM7vA/SRgsPVAw8JI/AAAAAAAAAq8/2MjjJxF3JQg/S220/balaram.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2387670469462939744.post-4064931571506164144</id><published>2008-11-09T04:28:00.000-08:00</published><updated>2008-11-09T04:33:02.519-08:00</updated><title type='text'>Predictive Analytics-A future 
Insight of Data Analysis</title><content type='html'>Predictive analytics encompasses a variety of techniques from statistics and data mining that analyze current and historical data to make predictions about future events. Such predictions rarely take the form of absolute statements, and are more likely to be expressed as values that correspond to the odds of a particular event or behavior taking place in the future.&lt;br /&gt;&lt;br /&gt;In business, predictive models exploit patterns found in historical and transactional data to identify risks and opportunities. Models capture relationships among many factors to allow assessment of risk or potential associated with a particular set of conditions, guiding decision making for candidate transactions.&lt;br /&gt;&lt;br /&gt;It is used in applications involving CRM, cross-selling, direct marketing and collection analytics, and it also helps to detect fraud in credit card applications.&lt;br /&gt;&lt;br /&gt;The statistical techniques used in predictive analytics are as follows:&lt;br /&gt;· Regression Techniques&lt;br /&gt;· Linear Regression Model&lt;br /&gt;· Discrete choice models&lt;br /&gt;· Logistic regression&lt;br /&gt;· Time series models&lt;br /&gt;&lt;br /&gt;Apart from these statistical techniques, some machine learning techniques are also used, such as neural networks and k-nearest neighbours.&lt;br /&gt;The tools used to help with the execution of predictive analytics include SAS, S-Plus, SPSS and Stata. For machine learning/data mining applications, KnowledgeSEEKER, KnowledgeSTUDIO, Enterprise Miner, GeneXproTools, Clementine, KXEN Analytic Framework and InforSense are some of the popularly used options.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;WEKA is a freely available open-source &lt;/span&gt; collection of machine learning methods for pattern classification, regression, clustering, and some types of meta-learning, which can be used for predictive 
analytics.&lt;br /&gt;&lt;br /&gt;Recently Business Objects announced a partnership with SPSS, a worldwide provider of predictive analytics software. The companies have entered into an original equipment manufacturer agreement under which Business Objects will offer its customers the ability to use SPSS predictive analytics data mining technology as part of the XI platform.&lt;br /&gt;&lt;br /&gt;Users of Business Objects XI with predictive analytics data mining technology will be able to leverage business predictions to make more informed decisions that can help generate revenue, control expenses, and mitigate risk.&lt;br /&gt;&lt;br /&gt;Today SAS, the leader in business intelligence, has significantly enhanced its award-winning SAS Enterprise Miner, SAS Text Miner, and SAS Forecast Server software, bringing predictive analytics to their highest level yet.&lt;br /&gt;&lt;br /&gt;The newest release of SAS Enterprise Miner improves productivity through added interactive advanced visualization and new analytics. 
Fifteen new analytical tools improve the resulting predictive models, which can mean significant savings for customers with proactive marketing departments such as in retail or banking.&lt;br /&gt;&lt;br /&gt;With innovative new modeling algorithms, including gradient boosting, partial least squares and support vector machines, SAS Enterprise Miner users can build more stable and more accurate models and thus make better decisions faster and with more confidence.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2387670469462939744-4064931571506164144?l=rht-bi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://rht-bi.blogspot.com/feeds/4064931571506164144/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2387670469462939744&amp;postID=4064931571506164144' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2387670469462939744/posts/default/4064931571506164144'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2387670469462939744/posts/default/4064931571506164144'/><link rel='alternate' type='text/html' href='http://rht-bi.blogspot.com/2008/11/predictive-analytics-future-insight-of.html' title='Predictive Analytics-A future Insight of Data Analysis'/><author><name>Tidhi</name><uri>http://www.blogger.com/profile/11392471881762637742</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_o3SPD6mM7vA/SRgsPVAw8JI/AAAAAAAAAq8/2MjjJxF3JQg/S220/balaram.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2387670469462939744.post-2544477675326935302</id><published>2008-11-07T17:29:00.000-08:00</published><updated>2008-11-07T17:31:57.315-08:00</updated><title type='text'>Star 
transform and Performance</title><content type='html'>As data warehousing people, we already know the top features that every data warehouse will use, such as partitioning, analytic SQL, bitmap indexes and much more. One of the features that we like is the star transformation. With careful preparation you can see incredible increases in performance. By creating tables intelligently and knowing the query patterns of your users, you can optimize how a table is deployed and its performance at the same time.&lt;br /&gt;&lt;br /&gt;Some things to think about when deploying fact and dimension tables:&lt;br /&gt;&lt;br /&gt;    * Base partitioning on the most logical separating column; this is generally a date column.&lt;br /&gt;    * Create bitmap indexes on individual columns.&lt;br /&gt;    * Define foreign key/primary key relationships; they do not have to be enabled, but the optimizer does need them defined.&lt;br /&gt;    * Use local indexes, or should I say minimize the use of global indexes, for maintenance reasons.&lt;br /&gt;    * Always use dbms_stats when analyzing a partitioned object.&lt;br /&gt;    * Build histograms for the optimizer.&lt;br /&gt;&lt;br /&gt;So this then leads us to the star transformation. The star transform is the optimizer's way of making the most of a query by reducing the amount of data that needs to be reviewed and retrieved by the database. The star transform can give you huge gains in query performance and is a feature that you need to review for your own warehouse. There are numerous initialization parameters that you need to set, which include:&lt;br /&gt;&lt;br /&gt;    * Star_transformation_enabled&lt;br /&gt;    * Hash_join_enabled&lt;br /&gt;    * Sort_area_size&lt;br /&gt;    * Sort_area_retained_size&lt;br /&gt;    * Bitmap_merge_area_size&lt;br /&gt;&lt;br /&gt;These parameters will all need to be tuned to achieve the full potential of star transformation. 
The bottom line is that you need to investigate how to best achieve performance gains for your own database, but this is one route that can help get you there.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2387670469462939744-2544477675326935302?l=rht-bi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://rht-bi.blogspot.com/feeds/2544477675326935302/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2387670469462939744&amp;postID=2544477675326935302' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2387670469462939744/posts/default/2544477675326935302'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2387670469462939744/posts/default/2544477675326935302'/><link rel='alternate' type='text/html' href='http://rht-bi.blogspot.com/2008/11/star-transform-and-performance.html' title='Star transform and Performance'/><author><name>Tidhi</name><uri>http://www.blogger.com/profile/11392471881762637742</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_o3SPD6mM7vA/SRgsPVAw8JI/AAAAAAAAAq8/2MjjJxF3JQg/S220/balaram.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2387670469462939744.post-5868657455100029022</id><published>2008-11-07T17:25:00.000-08:00</published><updated>2008-11-07T17:27:24.631-08:00</updated><title type='text'>What is Business Intelligence?</title><content type='html'>Business intelligence (BI) is more of a concept than a single technology. 
The goal is to gain insight into the business by bringing together data, formatting it in a way that enables better analysis, and then providing tools that give users power not just to examine and explore the data, but to quickly understand it.&lt;br /&gt;&lt;br /&gt;A list of sales numbers is not helpful unless it includes the products sold, when they were sold, where they were sold, and so on. It is important to include context when looking at data in order to turn it into information.&lt;br /&gt;&lt;br /&gt;While obtaining information is important, information is only useful if it is easy to grasp so that people can use it to make decisions. There is much information in books on nuclear physics or Cycladic statuary, but without the proper context and training such information can be hard to comprehend. It is therefore the goal to make data easy to comprehend; a quick grasp of the trends, relationships, and relative strengths and weaknesses is essential to delivering a usable system that truly delivers business value.&lt;br /&gt;&lt;br /&gt;The entire process of business intelligence can be broken into the following steps:&lt;br /&gt;&lt;br /&gt;1.       Identifying the business problem(s) to be addressed by the warehouse and the data needed to address those problems.&lt;br /&gt;&lt;br /&gt;2.       Identifying the location for all necessary data and extracting it from those sources.&lt;br /&gt;&lt;br /&gt;3.       Transforming the data from various sources into consolidated, consistent data.&lt;br /&gt;&lt;br /&gt;4.       Loading the transformed data into a centralized location.&lt;br /&gt;&lt;br /&gt;5.       Building a data warehouse (or data mart) with the data from the centralized location. The structure being built is called a cube.&lt;br /&gt;&lt;br /&gt;6.       Putting in place commercial products or custom applications that give access to the data in the cubes. 
There are many different ways of working with cube data, and different approaches make sense for different roles within an organization.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2387670469462939744-5868657455100029022?l=rht-bi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://rht-bi.blogspot.com/feeds/5868657455100029022/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2387670469462939744&amp;postID=5868657455100029022' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2387670469462939744/posts/default/5868657455100029022'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2387670469462939744/posts/default/5868657455100029022'/><link rel='alternate' type='text/html' href='http://rht-bi.blogspot.com/2008/11/what-is-business-intelligence.html' title='What is Business Intelligence?'/><author><name>Tidhi</name><uri>http://www.blogger.com/profile/11392471881762637742</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_o3SPD6mM7vA/SRgsPVAw8JI/AAAAAAAAAq8/2MjjJxF3JQg/S220/balaram.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2387670469462939744.post-6342731949985207656</id><published>2008-10-29T16:53:00.000-07:00</published><updated>2008-10-29T16:54:24.572-07:00</updated><title type='text'>My Wishlist!!</title><content type='html'>*  A data warehousing environment that provides the business with the information they need so users won't build data shadow systems. In the past I have blamed the business people for extracting data then creating hundreds of spreadsheets to build these systems. 
And, of course, I blamed Microsoft (via Excel) for being the enabler. But I found out that business people really don't want to build data shadow systems and spend all their time gathering, transforming and integrating data. They'd rather IT do it! But they do so because we haven't given them the information they need or the appropriate tools to leverage that information. Then they get into meetings and everyone argues over who has the right numbers. My wish is for our data warehouse to help prevent that.&lt;br /&gt;    * A data governance program where business and IT groups are committed at the start of the new project and on an ongoing basis. I used to think that with the right tools we'd get our "single version of the truth" (and that's what the articles and analysts said), but I have realized that people, politics and culture matter more. There isn't any tool that is going to define all our data and performance metrics for us and then get consensus. Even with Web 2.0, SOA and Blackberries, we have to do it the old-fashioned way – talk to people and get people to agree. Can you get all of these business and IT users to really commit to the project and see it through?&lt;br /&gt;    * A cost-effective tool to provide business intelligence. I used to think I wanted that single "best-of-breed" tool that is in the upper right corner of Gartner's Magic Quadrants. Those tools are always selected in evaluations, so I thought they were the best. But our business people tried them, yawned and then went back to Microsoft Excel. We need a solution that includes Excel and provides BI capabilities for reporting and analysis that are as easy as Microsoft Office. And my finance group "suggests" that the solution has to be cost-effective, i.e. no gargantuan enterprise license deals required. Are the elves developing anything like that in the workshop?&lt;br /&gt;    * A cost-effective and easy solution to distribute data from our data warehouse to data marts and OLAP cubes. 
While I was getting the ultimate, analyst-recommended data integration suite to perform ETL, EAI, EII, SOA, grid computing, parallel processing and everything else -- my developers went off and hand-coded all our data mart and OLAP cube loads. And our business groups loaded their data shadow systems using Microsoft Access. Please give me a simple-to-use tool that will get my developers to give up their hand-coding ways.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2387670469462939744-6342731949985207656?l=rht-bi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://rht-bi.blogspot.com/feeds/6342731949985207656/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2387670469462939744&amp;postID=6342731949985207656' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2387670469462939744/posts/default/6342731949985207656'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2387670469462939744/posts/default/6342731949985207656'/><link rel='alternate' type='text/html' href='http://rht-bi.blogspot.com/2008/10/my-wishlist.html' title='My Wishlist!!'/><author><name>Tidhi</name><uri>http://www.blogger.com/profile/11392471881762637742</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_o3SPD6mM7vA/SRgsPVAw8JI/AAAAAAAAAq8/2MjjJxF3JQg/S220/balaram.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2387670469462939744.post-8082379325565091282</id><published>2008-10-18T06:40:00.000-07:00</published><updated>2008-10-18T06:46:08.740-07:00</updated><title type='text'>Data Warehouse Control and Security</title><content type='html'>Imagine your organization has just built its data 
warehouse. The new data warehouse environment enables you to access corporate data in the form you want, when you want it, and where you want it to solve dynamic organizational problems, or make important decisions. You no longer feel frustrated with the inability of the Information Systems (IS) function to respond quickly to your diverse needs for information. The new environment empowers you to have the information processing world by the tail, and you are exceedingly thrilled by it all!&lt;br /&gt;&lt;br /&gt;Suddenly, a paranoid thought crept into your head, and you asked the classic question: What is your organization doing to identify, classify, quantify, and protect its valuable information assets? You posed this question to the data warehouse architects and administrators. They told you that there was nothing to worry about because the in-built security measures of your data warehouse environment could put the DoD systems to shame. Somewhere along the line, you sensed that they were neither objective nor convincing.&lt;br /&gt;&lt;br /&gt;So, you put on your hacking hat and went about the process of finding the answer to your question. As a general user, you easily managed to access some powerful user tools that were presumably restricted to unlimited access users. The tools enabled you to issue complex queries which accessed large volumes of data, consumed enormous resources, and slowed system response time considerably. Your trusted friend, a reformed hacker, was also able to access sensitive corporate data through the Internet without much ado. He was able to disclose your exact salary, birth date, social security number, and the date of your last performance evaluation among other things.&lt;br /&gt;&lt;br /&gt;Your findings led you to the classic answer: Your organization, like most, is doing little or nothing to protect its strategic information assets! 
Your data warehouse administrators could not pinpoint the causes of recent system problems and security breaches until you showed them the shocking results of what you and your friend had done. It was then that they admitted that security was not a priority during the development of the data warehouse. Driven by the need to complete the data warehouse project on time and within budget, and get impatient users off their backs, they did not give security requirements any thought.&lt;br /&gt;&lt;br /&gt;Your euphoric excitement about the new data warehouse vanished into the thick air of security concerns over your valuable corporate data. As a diligent corporate steward, you realized that it is high time for your organization to take a reality check!&lt;br /&gt;Defining Data Warehouse&lt;br /&gt;&lt;br /&gt;A data warehouse (DW) is a collection of integrated databases designed to support managerial decision-making and problem-solving functions. It contains both highly detailed and summarized historical data relating to various categories, subjects, or areas. All units of data are relevant to appropriate time horizons. The DW is an integral part of an enterprise-wide decision support system, and does not ordinarily involve data updating. It empowers end-users to perform data access and analysis. This eliminates the need for the IS function to perform informational processing from the legacy systems for the end-users. 
It also gives an organization certain competitive advantages, such as: fostering a culture of information sharing; enabling employees to effectively and efficiently solve dynamic organizational problems; minimizing operating costs and maximizing revenue; attracting and maintaining market share; and minimizing the impact of employee turnover.&lt;br /&gt;&lt;br /&gt;For instance, the internal audit functions of a multi-campus institution like the University of California build a DW to facilitate the sharing of strategic data, best audit practices, and expert insights on a variety of control topics. Auditors can access and analyze the DW data to efficiently make well-reasoned decisions (e.g., recommend cost-effective solutions to various internal control problems). Marrying DW architecture to artificial intelligence or neural applications also facilitates highly unstructured decision-making by the auditors. This results in timely completion of audit projects, improved quality of audit services, lower operating costs, and minimal impact from staff turnover. Implicit in the DW design is the concept of progress through sharing.&lt;br /&gt;&lt;br /&gt;The security requirements of the DW environment are not unlike those of other distributed computing systems. Thus, having an internal control mechanism to assure the confidentiality, integrity and availability of data in a distributed environment is of paramount importance. Unfortunately, most data warehouses are built with little or no consideration given to security during the development phase. Meeting the security requirements of the DW proactively is a seven-phase process: 1) identifying data, 2) classifying data, 3) quantifying the value of data, 4) identifying data security vulnerabilities, 5) identifying data protection measures and their costs, 6) selecting cost-effective security measures, and 7) evaluating the effectiveness of security measures. 
These phases are part of an enterprise-wide vulnerability assessment and management program.&lt;br /&gt;Phase One - Identifying the Data&lt;br /&gt;&lt;br /&gt;The first security task is to identify all digitally stored corporate data placed in the DW. This is an often ignored but critical phase of meeting the security requirements of the DW environment, since it forms the foundation for subsequent phases. It entails taking a complete inventory of all the data that is available to the DW end-users. The installed data monitoring software -- an important component of the DW -- can provide accurate information about all databases, tables, columns, rows of data, and profiles of data residing in the DW environment, as well as who is using the data and how often they use it.&lt;br /&gt;&lt;br /&gt;A manual procedure would require preparing a checklist of the same information described above. Whether the required information is gathered through an automated or a manual method, the collected information needs to be organized, documented and retained for the next phase.&lt;br /&gt;Phase Two - Classifying the Data&lt;br /&gt;&lt;br /&gt;Classifying all the data in the DW environment is needed to satisfy security requirements for data confidentiality, integrity and availability in a prudent manner. In some cases, data classification is a legally mandated requirement. Performing this task requires the involvement of the data owners, custodians, and the end-users. Data is generally classified on the basis of criticality or sensitivity to disclosure, modification, and destruction. The sensitivity of corporate data can be classified as:&lt;br /&gt;&lt;br /&gt;    * PUBLIC (Least Sensitive Data): For data that is less sensitive than confidential corporate data. Data in this category is usually unclassified and subject to public disclosure by laws, common business practices, or company policies. 
All levels of the DW end-users can access this data (e.g., audited financial statements, admission information, phone directories, etc.). &lt;br /&gt;&lt;br /&gt;    * CONFIDENTIAL (Moderately Sensitive Data): For data that is more sensitive than public data, but less sensitive than top secret data. Data in this category is not subject to public disclosure. The principle of least privilege applies to this data classification category, and access to the data is limited to a need-to-know basis. Users can only access this data if it is needed to perform their work successfully (e.g., personnel/payroll information, medical history, investments, etc.). &lt;br /&gt;&lt;br /&gt;    * TOP SECRET (Most Sensitive Data): For data that is more sensitive than confidential data. Data in this category is highly sensitive and mission-critical. The principle of least privilege also applies to this category -- with access requirements much more stringent than those of the confidential data. Only high-level DW users (e.g., unlimited access) with proper security clearance can access this data (e.g., R&amp;D, new product lines, trade secrets, recruitment strategy, etc.). Users can access only the data needed to accomplish their critical job duties. &lt;br /&gt;&lt;br /&gt;Regardless of which categories are used to classify data on the basis of sensitivity, the universal goal of data classification is to rank data categories by increasing degrees of sensitivity so that different protective measures can be used for different categories. Classifying data into different categories is not as easy as it seems. Certain data represents a mixture of two or more categories depending on the context used (e.g., time, location, and laws in effect). 
Determining how to classify this kind of data is both challenging and interesting.&lt;br /&gt;Phase Three - Quantifying the Value of Data&lt;br /&gt;&lt;br /&gt;In most organizations, senior management demands to see the smoking gun (e.g., cost-vs-benefit figures, or hard evidence of committed frauds) before committing corporate funds to support security initiatives. Cynical managers will be quick to point out that they deal with hard reality -- not soft variables concocted hypothetically. Quantifying the value of sensitive data warranting protective measures is as close to the smoking gun as one can get to trigger senior management's support and commitment to security initiatives in the DW environment.&lt;br /&gt;&lt;br /&gt;The quantification process is primarily concerned with assigning "street value" to data grouped under different sensitivity categories. By itself, data has no intrinsic value. However, the definite value of data is often measurable by the cost of (a) reconstructing lost data, (b) restoring the integrity of corrupted, fabricated, or intercepted data, (c) failing to make timely decisions due to denial of service, or (d) paying financial liability for public disclosure of confidential data. The data value may also include lost revenue from leakage of trade secrets to competitors, and advance use of secret financial data by rogue employees in the stock market prior to public release.&lt;br /&gt;&lt;br /&gt;Measuring the value of sensitive data is often a Herculean task. Some organizations use simple procedures for measuring the value of data. They build a spreadsheet application utilizing both qualitative and quantitative factors to reliably estimate the annualized loss expectancy (ALE) of data at risk. For instance, if it costs $10,000 annually (based on labor hours) to reconstruct data classified as top secret with an assigned risk factor of 4, then the company should expect to lose at least $40,000 a year if this top secret data is not adequately protected. 
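&lt;br /&gt;&lt;br /&gt;The arithmetic in that example can be written as a small Python sketch. The function name and the dictionary layout are illustrative assumptions, and the cost-times-risk-factor formula is only the spreadsheet-style calculation described above, not a standard risk model.&lt;br /&gt;
&lt;pre&gt;
```python
def annualized_loss_expectancy(annual_cost, risk_factor):
    """Expected yearly loss: annual cost scaled by the assigned risk factor."""
    return annual_cost * risk_factor


if __name__ == "__main__":
    # Figures from the example: $10,000 reconstruction cost, risk factor 4.
    # The dictionary keys are hypothetical names chosen for this sketch.
    top_secret = {"annual_cost": 10_000, "risk_factor": 4}
    ale = annualized_loss_expectancy(**top_secret)
    print(f"Expected annual loss: ${ale:,}")
```
&lt;/pre&gt;
The same function can be applied to each sensitivity category in turn to compare where protection spending matters most.&lt;br /&gt;&lt;br /&gt;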
Similarly, if an employee is expected to successfully sue the company and recover $250,000 in punitive damages for public disclosure of privacy-protected personal information, then the liability cost plus legal fees paid to the lawyers can be used to calculate the value of the data. The risk factor (e.g., probability of occurrence) can be determined arbitrarily or quantitatively. The higher the likelihood of attacking a particular unit of data, the greater the risk factor assigned to that data set.&lt;br /&gt;&lt;br /&gt;Measuring the value of strategic information assets based on accepted classification categories can be used to show what an organization can save (e.g., Return on Investment) if the assets are properly protected, or lose (annual dollar loss) if it does not act to protect the valuable assets.&lt;br /&gt;Phase Four - Identifying Data Vulnerabilities&lt;br /&gt;&lt;br /&gt;This phase requires the identification and documentation of vulnerabilities associated with the DW environment. Some common vulnerabilities of DW include the following:&lt;br /&gt;&lt;br /&gt;    * In-built DBMS Security: Most data warehouses rely heavily on in-built security that is primarily VIEW-based. The VIEW-based security is inadequate for the DW because it can be easily bypassed by a direct dump of data. It also does not protect data during the transmission from servers to clients -- exposing the data to unauthorized access. The security feature is equally ineffective for the DW environment where the activities of the end-users are largely unpredictable. &lt;br /&gt;&lt;br /&gt;    * DBMS Limitations: Not all database systems housing the DW data have the capability to concurrently handle data of different sensitivity levels. Most organizations, for instance, use one DW server to process top secret and confidential data at the same time. 
However, the programs handling top secret data may not prevent the data from leaking to the programs handling the confidential data, and limited DW users authorized to access only the confidential data may not be prevented from accessing the top secret data. &lt;br /&gt;&lt;br /&gt;    * Dual Security Engines: Some data warehouses combine the in-built DBMS security features with the operating system access control package to satisfy their security requirements. Using dual security engines tends to present opportunities for security lapses and to exacerbate the complexity of security administration in the DW environment. &lt;br /&gt;&lt;br /&gt;    * Inference Attacks: Different access privileges are granted to different DW users. All users can access public data, but only a select few would presumably access confidential or top secret data. Unfortunately, general users can access protected data by inference without having direct access to the protected data. Sensitive data is typically inferred from seemingly non-sensitive data. Exposure to direct and indirect inference attacks is a common vulnerability in the DW environment. &lt;br /&gt;&lt;br /&gt;    * Availability Factor: Availability is a critical requirement upon which the shared access philosophy of the DW architecture is built. However, the availability requirement can conflict with or compromise the confidentiality and integrity of the DW data if not carefully considered. &lt;br /&gt;&lt;br /&gt;    * Human Factors: Accidental and intentional acts such as errors, omissions, modifications, destruction, misuse, disclosure, sabotage, fraud, and negligence account for most of the costly losses incurred by organizations. These acts adversely affect the integrity, confidentiality, and availability of the DW data. &lt;br /&gt;&lt;br /&gt;    * Insider Threats: The DW users (employees) represent the greatest threat to valuable data. 
Disgruntled employees with legitimate access could leak secret data to competitors and publicly disclose certain confidential human resources data. Rogue employees can also profit from using strategic corporate data in the stock market before such information is released to the public. These activities cause (a) strained relationships with business partners or government entities, (b) loss of money from financial liabilities, (c) loss of public confidence in the organization, and (d) loss of competitive edge. &lt;br /&gt;&lt;br /&gt;    * Outsider Threats: Competitors and other outside parties pose a threat to the DW environment similar to that of unethical insiders. These outsiders engage in electronic espionage and other hacking techniques to steal, buy, or gather strategic corporate data in the DW environment. Risks from these activities include (a) negative publicity which decimates the ability of a company to attract and retain customers or market share, and (b) loss of continuity of DW resources which negates user productivity. The resultant losses tend to be higher than those of insider threats. &lt;br /&gt;&lt;br /&gt;    * Natural Factors: Fire, water, and air damage can render both the DW servers and clients unusable. Risks and losses vary from organization to organization -- depending mostly on location and contingency factors. &lt;br /&gt;&lt;br /&gt;    * Utility Factors: Interruption of electricity and communications service causes costly disruption to the DW environment. These factors have a lower probability of occurrence, but tend to result in excessive losses. 
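&lt;br /&gt;&lt;br /&gt;To make the inference-attack item in the list above concrete, here is a minimal Python sketch. The employee names, salary figures, and helper functions are made up for illustration, and it assumes a general user may run aggregate queries but may not read individual rows.&lt;br /&gt;
&lt;pre&gt;
```python
# Hypothetical payroll rows; individual salaries are meant to be confidential.
PAYROLL = {"alice": 82000, "bob": 67000, "carol": 91000}


def total_salary(names):
    """Aggregate query a general user is assumed to be allowed to run."""
    return sum(PAYROLL[name] for name in names)


def infer_salary(target):
    """Differencing attack: two permitted totals reveal one protected value."""
    everyone = sorted(PAYROLL)
    everyone_else = [name for name in everyone if name != target]
    return total_salary(everyone) - total_salary(everyone_else)


if __name__ == "__main__":
    # The user never reads Bob's row, yet recovers his exact salary.
    print(infer_salary("bob"))
```
&lt;/pre&gt;
Neither aggregate looks sensitive on its own, which is why this kind of leak is hard to block with VIEW-based restrictions alone.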
&lt;br /&gt;&lt;br /&gt;A comprehensive inventory of the vulnerabilities inherent in the DW environment needs to be documented and organized (e.g., as major or minor) for the next phase.&lt;br /&gt;Phase Five - Identifying Protective Measures and Their Costs&lt;br /&gt;&lt;br /&gt;Vulnerabilities identified in the previous phase should be considered in order to determine cost-effective protection for the DW data at different sensitivity levels. Some protective measures for the DW data include:&lt;br /&gt;&lt;br /&gt;    * The Human Wall: Employees represent the front line of defense against security vulnerabilities in any decentralized computing environment, including DW. Addressing employee hiring, training (security awareness), periodic background checks, transfers, and termination as part of the security requirements helps create a security-conscious DW environment. This approach effectively treats the root causes, rather than the symptoms, of security problems. Human resources management costs are easily measurable. &lt;br /&gt;&lt;br /&gt;    * Access Users Classification: Classify data warehouse users as 1) General Access Users, 2) Limited Access Users, and 3) Unlimited Access Users for access control decisions. &lt;br /&gt;&lt;br /&gt;    * Access Controls: Use an access control policy based on the principles of least privilege and adequate data protection. Enforce effective and efficient access control restrictions so that end-users can access only the data or programs for which they have legitimate privileges. Corporate data must be protected to the degree consistent with its value. Users need to obtain a granular security clearance before they are granted access to sensitive data. Also, access to sensitive data should rely on more than one authentication mechanism. These access controls minimize damage from accidental and malicious attacks. 
&lt;br /&gt;&lt;br /&gt;    * Integrity Controls: Use a control mechanism to a) prevent all users from updating and deleting historical data in the DW, b) restrict data merge access to authorized activities only, c) protect the DW data from power failures, system crashes and corruption, d) enable rapid recovery of data and operations in the event of disasters, and e) ensure the availability of consistent, reliable and timely data to the users. These are achieved through the OS integrity controls and well-tested disaster recovery procedures. &lt;br /&gt;&lt;br /&gt;    * Data Encryption: Encrypting sensitive data in the DW helps ensure that the data is accessed on an authorized basis only. This reduces the potential value of data interception, fabrication and modification. It also inhibits unauthorized dumping and interpretation of data, and enables secure authentication of users. In short, encryption helps protect the confidentiality and integrity of data in the DW environment. &lt;br /&gt;&lt;br /&gt;    * Partitioning: Use a mechanism to partition sensitive data into separate tables so that only authorized users can access these tables based on legitimate needs. The partitioning scheme relies on a simple built-in DBMS security feature to prevent unauthorized access to sensitive data in the DW environment. However, use of this method presents some data redundancy problems. &lt;br /&gt;&lt;br /&gt;    * Development Controls: Use quality control standards to guide the development, testing and maintenance of the DW architecture. This approach ensures that security requirements are sufficiently addressed during and after the development phase. It also ensures that the system is highly elastic (e.g., adaptable or responsive to changing security needs). &lt;br /&gt;&lt;br /&gt;The estimated costs of each security measure should be determined and documented for the next phase. Commercial packages (e.g., CORA, RANK-IT, BUDDY SYSTEM, BDSS, BIA Professional, etc.) 
and in-house developed applications can help identify appropriate protective measures for known vulnerabilities and quantify their associated costs or fiscal impact. Measuring the costs usually involves determining the development, implementation, and maintenance costs of each security measure.&lt;br /&gt;Phase Six - Selecting Cost-Effective Security Measures&lt;br /&gt;&lt;br /&gt;All security measures involve expenses, and security expenses require justification. This phase relies on the results of previous phases to assess the fiscal impact of corporate data at risk, and select cost-effective security measures to safeguard the data against known vulnerabilities. Selecting cost-effective security measures is congruent with a prudent business practice which ensures that the costs of protecting the data at risk do not exceed the maximum dollar loss of the data. Senior management would, for instance, deem it imprudent to commit $500,000 annually to safeguarding data with an annualized loss expectancy of only $250,000.&lt;br /&gt;&lt;br /&gt;However, the cost factor should not be the only criterion for selecting appropriate security measures in the DW environment. Compatibility, adaptability and potential impact on the DW performance should also be taken into consideration. Additionally, there are two important factors. First, the economy of mechanism principle dictates that a simple, well-tested protective measure can be relied upon to control multiple vulnerabilities in the DW environment. Second, data, unlike hardware and software, is the element in the IS security arena with the shortest life span. 
Thus, the principle of adequate data protection dictates that the DW data can be protected with security measures that are effective and efficient enough for the short life span of the data.&lt;br /&gt;Phase Seven - Evaluating the Effectiveness of Security Measures&lt;br /&gt;&lt;br /&gt;A winning basketball formula from the John Wooden school of thought teaches that a good team should be prepared to rebound every shot that goes up, even if it is made by the greatest player on the court. Similarly, a winning security strategy is to assume that all security measures are breakable, or not permanently effective. Every time we identify and select cost-effective security measures to secure our strategic information assets against certain attacks, the attackers tend to double their efforts in identifying methods to defeat our implemented security measures. The best we can do is to prevent this from happening, make the attacks difficult to carry out, or be prepared to rebound quickly if our assets are attacked. We will not be well positioned to do any of these if we do not evaluate the effectiveness of security measures on an ongoing basis.&lt;br /&gt;&lt;br /&gt;Evaluating the effectiveness of security measures should be conducted continuously to determine whether the measures are: 1) small, simple and straightforward, 2) carefully analyzed, tested and verified, 3) used properly and selectively so that they do not exclude legitimate accesses, 4) elastic so that they can respond effectively to changing security requirements, and 5) reasonably efficient in terms of time, memory space, and user-centric activities so that they do not adversely affect the protected computing resources. It is equally important to ensure that the DW end-users understand and embrace the propriety of security measures through an effective security awareness program. 
The data warehouse administrator (DWA), with delegated authority from senior management, is responsible for ensuring the effectiveness of security measures.&lt;br /&gt;Encryption Requirements&lt;br /&gt;&lt;br /&gt;Encrypting sensitive data in the DW environment can be done at the table, column, or row level. Encrypting the columns of a table that contain sensitive data is the most common and straightforward approach. A few examples of columns that are usually encrypted include social security numbers, salaries, birth dates, performance evaluation ratings, confidential bank information, and credit card numbers. Locating individual records in a table through a standard search command will be exceedingly difficult if any of the encrypted columns serve as keys to the table. Organizations that use social security numbers as keys to database tables should seriously consider using alternative pseudonym codes (e.g., randomly generated numbers) as keys before encrypting the SSN column.&lt;br /&gt;&lt;br /&gt;Encrypting only selected rows of data is not commonly used, but can be useful in some unique cases. For instance, a single encryption algorithm can be used to encrypt the ages of some employees who insist on non-disclosure of their ages for privacy reasons. Multiple encryption algorithms can also be used to encrypt rows of data reflecting sensitive transactions for different campuses so that geographically distributed users of the same DW can only view/search transactions (rows) related to their respective campuses. If not carefully planned, mixing separate rows of encrypted and unencrypted data and managing multiple encryption algorithms in the same DW environment can introduce chaos, including flawed data search results.&lt;br /&gt;&lt;br /&gt;Encrypting a table (all columns/rows) is very rarely used because it essentially renders the data useless in the DW environment. 
The procedures required to decrypt the encrypted keys before accessing the records in a useful format are very cumbersome and cost-prohibitive.&lt;br /&gt;&lt;br /&gt;The encryption algorithm selected for the DW environment should be able to preserve field type and field length characteristics. It should also work cooperatively with the access and analysis software package in the DW environment. Specifically, the data decryption sequence must be executed before the data reaches the software package handling the standard query. Otherwise, the package could prevent decryption of the encrypted data -- rendering the data useless.&lt;br /&gt;Encryption Constraints&lt;br /&gt;&lt;br /&gt;Performing data encryption and decryption on the DW server consumes significant CPU processing cycles. This results in excessive overhead costs and degraded system performance. Also, performing decryption on the DW server before transmitting the decrypted data to the client (end-user's workstation) exposes the data to unauthorized access during the transmission. These problems can be minimized if the encryption and decryption functions are effectively deployed to the workstation level, where more CPU cycles are available for processing.&lt;br /&gt;&lt;br /&gt;In addition, improperly used encryption (e.g., a weak encryption algorithm) can give users a false sense of security. Encrypted data in the DW must be decrypted before the standard query operations can be performed. This increases the time needed to process a query, which can irritate end-users and turn them against the encryption mechanism. Finally, it is still illegal to use certain encryption algorithms outside the U.S. borders.&lt;br /&gt;Data Warehouse Administration&lt;br /&gt;&lt;br /&gt;The size of historical data in the DW environment grows significantly every year, while the use of the data tends to decrease dramatically. This increases storage, processing and operating costs of the DW annually. 
It necessitates the periodic phasing out of least used or unused data -- usually after a detailed analysis of the least and most accessed data over a long time horizon. A prudent decision has to be made as to how long historical data should be kept in the DW environment before it is phased out en masse. The DWA may not meet these challenges effectively without the necessary tools (activity and data monitors), resources (funds and staffing support) and management philosophy (strategic planning and management). For these reasons, the DWA should be a good strategist, an effective communicator, an astute politician, and a competent technician.&lt;br /&gt;Control Reviews&lt;br /&gt;&lt;br /&gt;The internal control review approach of the DW environment should be primarily forward-looking (emphasizing up-front prevention) as opposed to backward-looking (emphasizing after-the-fact verification). This approach calls for the use of pre-control and concurrent control assessment techniques to look at such issues as (a) data quality control, (b) effectiveness of security management, (c) economy and efficiency of DW operations, (d) accomplishment of operational goals or quality standards, and (e) overall DW administration. Effective collaboration with the internal customers (the DWA and end-users) and use of automated control tools are essential for conducting these control reviews competently.&lt;br /&gt;Conclusions&lt;br /&gt;&lt;br /&gt;The seven phases of the systematic vulnerability assessment and management program described in this article are helpful in averting underprotection and overprotection (two undesirable security extremes) of the DW data. This is achieved through the eventual selection of cost-effective security measures which ensure that different categories of corporate data are protected to the degree necessary. 
The program also shifts the management focus from taking corrective security actions in a crisis mode to prevention of security crises in the DW environment.&lt;br /&gt;&lt;br /&gt;It is generally recognized that the goal of DW is to provide decision-makers access to consistent, reliable, and timely data for analytical, planning, and assessment purposes in a format that allows for easy retrieval, exploration and analysis. The need for accurate information in the most efficient and effective manner is congruent with the security requirements for data integrity and availability.&lt;br /&gt;&lt;br /&gt;Thus, it is a winning corporate strategy to ensure a happy marriage between the idealism of DW based on empowered informational processing, and the pragmatism of a proactive security philosophy based on prudent security practices in the empowered computing environment. The myth that security defeats the goal of DW, or cannot coexist in the DW environment should be debunked. Anything less would be imprudent.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2387670469462939744-8082379325565091282?l=rht-bi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://rht-bi.blogspot.com/feeds/8082379325565091282/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2387670469462939744&amp;postID=8082379325565091282' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2387670469462939744/posts/default/8082379325565091282'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2387670469462939744/posts/default/8082379325565091282'/><link rel='alternate' type='text/html' href='http://rht-bi.blogspot.com/2008/10/data-warehouse-control-and-security.html' title='Data Warehouse Control and 
Security'/><author><name>Tidhi</name><uri>http://www.blogger.com/profile/11392471881762637742</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_o3SPD6mM7vA/SRgsPVAw8JI/AAAAAAAAAq8/2MjjJxF3JQg/S220/balaram.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2387670469462939744.post-7564910797095246983</id><published>2008-06-19T05:44:00.000-07:00</published><updated>2008-06-19T08:15:22.068-07:00</updated><title type='text'>How to Influence Up</title><content type='html'>In any company's Business Intelligence initiative:&lt;br /&gt;--- WE NEED EXECUTIVE BUY-IN or EXEC BLESSING ---&lt;br /&gt;&lt;br /&gt;How is the role of the leader changing?&lt;br /&gt;Integrity, customer satisfaction, competitive advantage, and communicating a vision were important for leaders in the past and remain so. Because of changing dynamics, however, leadership in the future will differ from leadership in the past.&lt;br /&gt;&lt;br /&gt;Differences:&lt;br /&gt;&lt;br /&gt;1. Thinking globally instead of domestically in terms of suppliers and support staff&lt;br /&gt;2. Appreciating cross-cultural diversity&lt;br /&gt;3. Being technologically savvy. A great leader does not need to be a technician, but must understand how technology affects the core business and be technically competent enough to navigate the new world&lt;br /&gt;4. Building alliances, partnerships, and all kinds of different relationships&lt;br /&gt;5. Sharing leadership&lt;br /&gt;&lt;br /&gt;Historically, a leader -- ** HAD TO TELL ** KNEW MORE THAN DIRECT REPORTS ** --&lt;br /&gt;In the future, a leader -- ** HAS TO ASK ** manages knowledge workers, who know more than the BOSS.&lt;br /&gt;&lt;br /&gt;"Leaders who ask, listen, and follow up in a disciplined way are more successful."&lt;br /&gt;&lt;br /&gt;Influencing upper management:&lt;br /&gt;1. Focus on who has the power to make the decision, not on who is pretty, good, nice, etc.&lt;br /&gt;2. 
Think like a great salesperson. It is not their responsibility to buy; it is yours to sell.&lt;br /&gt;Next, sell to the higher level based on their needs, not yours.&lt;br /&gt;3. Present a cost-benefit analysis.&lt;br /&gt;4. Sell for the bigger needs of the business.&lt;br /&gt;If you can't sell, make peace.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;"Treat the manager with the same courtesy with which you want to be treated."&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2387670469462939744-7564910797095246983?l=rht-bi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://rht-bi.blogspot.com/feeds/7564910797095246983/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2387670469462939744&amp;postID=7564910797095246983' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2387670469462939744/posts/default/7564910797095246983'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2387670469462939744/posts/default/7564910797095246983'/><link rel='alternate' type='text/html' href='http://rht-bi.blogspot.com/2008/06/how-to-influence-up.html' title='How to Influence Up'/><author><name>Tidhi</name><uri>http://www.blogger.com/profile/11392471881762637742</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_o3SPD6mM7vA/SRgsPVAw8JI/AAAAAAAAAq8/2MjjJxF3JQg/S220/balaram.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2387670469462939744.post-7759460834552754004</id><published>2008-06-13T18:06:00.000-07:00</published><updated>2008-06-13T18:17:11.992-07:00</updated><title type='text'>Business intelligence ROI and focus</title><content type='html'>The article, &lt;a 
href="http://searchdatamanagement.techtarget.com/news/article/0,289142,sid91_gci1305065,00.html?asrc=SS_CLA_307404&amp;psrc=CLT_91"&gt;Gartner: Business intelligence ROI, value a matter of mind over money&lt;/a&gt;, begins with "Determining the return on investment (ROI) and value of a business intelligence (BI) software investment is often an exasperating task, but not an impossible one, according to one Gartner analyst."&lt;br /&gt;&lt;br /&gt;I completely agree, but I also feel it's a matter of maturity, and mature BI environments can get there. I also believe that measuring is a best practice and that it correlates highly with overall "success", whether success is defined by the numbers or otherwise.&lt;br /&gt;&lt;br /&gt;Following are some focuses, in order from healthiest to unhealthiest, that business intelligence programs fall into. As we progress through them, you will notice the focus gets further and further away from the user.&lt;br /&gt;&lt;br /&gt;Business Focus #1: Return on Investment&lt;br /&gt;ROI is the holy grail of focus for business intelligence. Teams that focus on achieving it have learned what business intelligence is all about. Studies have shown that driving toward ROI highly correlates to self-reported program success scores. The focus on ROI seems to encourage the development team to work backwards toward doing the right things day in and day out for the ultimate arbiter of success - the bottom line. Ultimately, to claim this focus, a team must have a great handle on the succeeding focuses as well.&lt;br /&gt;&lt;br /&gt;Business Focus #2: Data Usage&lt;br /&gt;Programs that don't measure ROI, or are too removed from the business processes that drive ROI, but still want a business-focused BI program tend to focus on the usage of the data. The objective here is increasing the volume and complexity of usage. 
With this focus, user statistics such as logins and query bands are tracked; however, little is understood about what the users ultimately do with the results.&lt;br /&gt;&lt;br /&gt;Business Focus #3: Data Gathering and Availability&lt;br /&gt;&lt;br /&gt;Under this focus, the business intelligence team becomes an internal data brokerage, serving up data for the organization's consumption. Users are not tracked because success is measured in the availability of the data.&lt;br /&gt;&lt;br /&gt;In these environments so removed from usage, it is often a struggle for the users to leverage the data. It is not unusual to find a host of downstream processes (e.g., Excel, Access) operating to "fix," "clean" and make this data usable. Users may have grass roots efforts underway to utilize each other's "code."&lt;br /&gt;&lt;br /&gt;These environments often come about when there is high complexity in the data extraction and movement layer of the architecture. While it's an accomplishment to deliver the data in these environments, the team should not neglect the need to deliver business intelligence, which requires the accoutrements related to usage to be in place -- such as governance, stewardship and a public relations program.&lt;br /&gt;&lt;br /&gt;User satisfaction with such programs begins to fade once users are left to deal with the limitations of delivered raw data.&lt;br /&gt;&lt;br /&gt;Technical Focus #1: Key (Technical) Performance Indicators&lt;br /&gt;This is the technical counterpart to a business focus on data usage, but it is not as effective overall. There can be an especially large number of KPIs for the business intelligence program in the area of ETL. These are analogous to the metrics you might place in the operational metadata -- uptime, cycle end times, successful loads, clean data levels, etc. 
While important, they do not comprise the end game.&lt;br /&gt;&lt;br /&gt;Technical Focus #2: Adherence to a Guru Approach&lt;br /&gt;One of the ultimate disservices business intelligence teams can do is to spend their budget primarily making sure the architecture adheres to a book standard - as opposed to what works for the users.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2387670469462939744-7759460834552754004?l=rht-bi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://rht-bi.blogspot.com/feeds/7759460834552754004/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2387670469462939744&amp;postID=7759460834552754004' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2387670469462939744/posts/default/7759460834552754004'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2387670469462939744/posts/default/7759460834552754004'/><link rel='alternate' type='text/html' href='http://rht-bi.blogspot.com/2008/06/bi-from-both-sides-aligning-business.html' title='Business intelligence ROI and focus'/><author><name>Tidhi</name><uri>http://www.blogger.com/profile/11392471881762637742</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_o3SPD6mM7vA/SRgsPVAw8JI/AAAAAAAAAq8/2MjjJxF3JQg/S220/balaram.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2387670469462939744.post-3634479559074751243</id><published>2008-04-29T13:30:00.001-07:00</published><updated>2008-04-29T13:30:54.800-07:00</updated><title type='text'>Deadly sins of Data Warehouse</title><content type='html'>Data warehouse implementation is a formidable undertaking. 
Most of the experience you bring to the task won’t fit a data warehouse’s unique requirements and challenges. There are several things you might ordinarily do that you should steer clear of when working on a data warehouse.&lt;br /&gt;&lt;br /&gt;What follows is a list of things you can do to really screw things up. What you might find odd about this list is some of the items might not seem so bad. However, a data warehouse isn’t a transactional system; it conforms to no particular standard, does not implement a particular application, and is very organic in nature. In short, the data warehouse you are about to build is unique to the company that’s building it, and you may find yourself inventing new ways to do what it is that you do in order to make it happen. And the best way to find out what works is to maneuver around what won’t work. Here are some don’ts you should take seriously.&lt;br /&gt;&lt;br /&gt;1. DON'T write code you can't modify quickly&lt;br /&gt;The apps you’ll be accommodating will be analytical, not transactional. The users working with you to spec them out really don’t know exactly what they want—so you may go through several iterations before you get where you’re going. If you write well-constructed, flexible code, assuming it will change, your life will be easier. Write fly-by-wire code and you’ll regret it.&lt;br /&gt;&lt;br /&gt;2. DON'T use a database access API that won't allow modifications&lt;br /&gt;In the past, your database work has accommodated apps that accessed moderate amounts of data, for a potentially large pool of users. Now, you’re accommodating apps that will pull in huge amounts of data. You need to write code that will get the maximum amount of data with each read, and you’re not going to get that right the first time. So you need to choose tools that permit you to revise quickly and radically.&lt;br /&gt;&lt;br /&gt;3. 
DON'T design anything that isn't extendable&lt;br /&gt;Analytics aren’t really applications in the sense that Online Transaction Processing (OLTP) apps are. The point of analytics is to take large amounts of old data, find patterns in the data, and infer new information (intelligence) from the patterns. The code you write to access the underlying information may need to be stretched to include additional data, which will require joins. Don’t write code supporting analytics that fails to assume this will be the case!&lt;br /&gt;&lt;br /&gt;4. DON'T insinuate anything between the data and the user unnecessarily&lt;br /&gt;A warehouse needs to be exactly that. A user needs to be able to walk into the warehouse and pull information off the shelf. Because of the nature of business intelligence, analytics, and the metrics your client's users are seeking in order to gauge performance, users need an environment that permits them to pick and choose the data they wish to include in their analyses, whatever it might be. You can’t always accommodate this ideal, but you must do your best. Don’t add anything to their analytical apps that will make warehouse data access any harder!&lt;br /&gt;&lt;br /&gt;5. DON'T take shortcuts on data cleanup or source analysis&lt;br /&gt;The single biggest black hole you’ll encounter will be analyzing data sources for the Extract-Transform-Load mechanism, and the act of cleaning up data for loading. It is safe to assume the project manager will budget more than half of the total project resources for this phase alone. To be blunt, if you take shortcuts here, you will most certainly burn for it later. Don’t skimp on cleaning up dirty data, no matter how dull the work.&lt;br /&gt;&lt;br /&gt;6. DON'T avoid granularity and partitioning issues&lt;br /&gt;The two biggest data storage issues in warehouse design are settling transformed data at the proper level of granularity and partitioning data categorically. Why is this so important? 
Because the total warehouse volume shifts geometrically in response to granularity, and the efficiency of data access is directly proportional to the effectiveness of your partitioning. This is difficult grunt work, but it is critical. Don’t try to step around it.&lt;br /&gt;&lt;br /&gt;7. DON'T try to work OLAP without asking business questions&lt;br /&gt;Your client's users don’t really know what they want out of their apps until they see it. There will be lots of trial and error as they fish for the analytical result that will honestly deliver the performance metrics or forecasting that will make a difference in the way their department or the company does business. You don’t stand a chance of contributing to this process or catching their mistakes, if you don’t go beyond your role as IT accommodator and learn as much as possible about how their department (and, by extension, the company as a whole) functions. In conventional OLTP development, you can count on those around you to mind the business picture. In Online Analytical Processing (OLAP), everything is exploratory, and the people around you won’t necessarily catch mistakes that result from your misunderstanding. So, don’t assume you know more than you do. 
Ask the extra questions that will ensure you really do have a handle on the "business" in "business intelligence."&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2387670469462939744-3634479559074751243?l=rht-bi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://rht-bi.blogspot.com/feeds/3634479559074751243/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2387670469462939744&amp;postID=3634479559074751243' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2387670469462939744/posts/default/3634479559074751243'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2387670469462939744/posts/default/3634479559074751243'/><link rel='alternate' type='text/html' href='http://rht-bi.blogspot.com/2008/04/deadly-sins-of-data-warehouse.html' title='Deadly sins of Data Warehouse'/><author><name>Tidhi</name><uri>http://www.blogger.com/profile/11392471881762637742</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_o3SPD6mM7vA/SRgsPVAw8JI/AAAAAAAAAq8/2MjjJxF3JQg/S220/balaram.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2387670469462939744.post-4738618442860563056</id><published>2008-01-09T19:08:00.000-08:00</published><updated>2008-01-09T19:09:07.258-08:00</updated><title type='text'>10 Keys to a Successful Business Intelligence Strategy</title><content type='html'>October 22, 2007 - CIO - With all the mergers and acquisitions in the business intelligence (BI) space, it’s easy to forget that BI is about much more than the technology that’s behind it.&lt;br /&gt;You need to establish your vision for your business intelligence strategy before you bring technology into the 
conversation, says Boris Evelson, a Forrester Research analyst and lead author on the upcoming study “It’s Time to Reinvent Your BI Strategy.” Here’s how.&lt;br /&gt;1. Choose a C-level sponsor (who’s not the CIO). Business intelligence implementations should absolutely not be sponsored by anyone in IT, says Evelson. Instead, BI should be sponsored by an executive who has bottom-line responsibility; has a broad picture of the enterprise objectives, strategy and goals; and knows how to translate the company mission into key performance indicators that will support that mission. This executive is often the CFO. This sponsor should govern the implementation with a documented business case and be responsible for changes in scope.&lt;br /&gt;2. Create common definitions. Without common definitions, a BI implementation cannot succeed. And lack of agreement is a widespread problem in companies today. For example, finance and sales may define “gross margin” differently, which means that numbers will not match—in effect, negating the value of automation. To combat this problem, get subject matter expertise throughout lines of business from front-, middle- and back-office staff. At this stage, IT's participation should be limited to running the project management office and taking ownership of compliance and business standards and policies. Secondly, start small and choose only 10 to 20 key performance indicators and create standards and governance with them in mind.&lt;br /&gt;3. Assess the current situation. You should analyze the current business intelligence stack and processes and organizational structures surrounding current BI implementations. Both IT and the business should be involved. Evelson cautions against underestimating this phase, and points out that a full “BI diagnostic” from Accenture contains 1,500 questions against 325 best practices and 75 subject areas.&lt;br /&gt;4. Create a plan for data storage. 
Many organizations begin with an isolated data mart, since it’s quick and cheap, but consider that this tactic means additional silos will need to be created as additional data storage needs arise, which can grow out of control within a few years. Something else to consider is whether to build and maintain a physical data warehouse or go with the virtual, so-called “semantic” layers to link operational systems. Traditional data warehousing means duplicating data, which means bringing in operations systems in real time will be next to impossible. You can save space with an abstract definition layer, but this is difficult to design, as is any metadata repository. Before even considering which vendors to choose, you must resolve this issue.&lt;br /&gt;5. Understand what users need. The three broad classes of business intelligence users are strategic, tactical and operational. Strategic users make few decisions, but each one can have a profound effect—for example, should we close operations in Europe and open them in China. Tactical users make many decisions a week, and use both aggregate and detail-level information, and likely need updated information daily. Operational users are the front-line employees, such as call center staff. They need data within their own set of applications to execute the enormous numbers of transactions. Understanding who will use BI and for what purposes can show the type of information needed and its frequency, and help guide BI decision making.&lt;br /&gt;6. Decide whether to buy or build the analytical data model. One size does not fit all. In general you may benefit from an out-of-the-box, industry-specific data model if you have a more homogeneous IT environment—such as one ERP, one CRM system. Do watch for extensibility and hierarchy flexibility. 
More complex enterprises may benefit from customization, although you may still want to consider beginning with an industry-standard model as a template or a set of guides (such as typical facts, dimensions and so on).&lt;br /&gt;7. Consider all business intelligence components. Components that affect the success of business intelligence implementations include: metadata, data integration, data quality, data modeling, analytics, centralized metrics management, presentations (reports and dashboards), portals, collaboration, knowledge management and master data management. Be sure to define the architecture for all layers of the business intelligence stack; even though they may not be part of the BI strategy itself, they will affect the success of the implementation.&lt;br /&gt;8. Choose a systems integrator. Business intelligence implementations require guidance from a partner who has deep experience. Evelson says to be prepared to spend $5 to $7 on services for every $1 on software, and cautions: Do not outsource the fine-tuning of business intelligence. The process requires a high degree of collaboration among end users, analysts and developers.&lt;br /&gt;9. Think “actionable” and “baby steps.” Choose an end user, business analyst and developer to create a first proof of concept within a few days. Choose a few key performance indicators and build a few reports, then add new releases every few weeks.&lt;br /&gt;10. Choose low-hanging fruit to start. Evelson recommends choosing high-value, simple components to begin. 
For example, a sales analytics data mart may present high-value targets that also have plenty of existing models and best practices.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2387670469462939744-4738618442860563056?l=rht-bi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://rht-bi.blogspot.com/feeds/4738618442860563056/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2387670469462939744&amp;postID=4738618442860563056' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2387670469462939744/posts/default/4738618442860563056'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2387670469462939744/posts/default/4738618442860563056'/><link rel='alternate' type='text/html' href='http://rht-bi.blogspot.com/2008/01/10-keys-to-successful-business.html' title='10 Keys to a Successful Business Intelligence Strategy'/><author><name>Tidhi</name><uri>http://www.blogger.com/profile/11392471881762637742</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_o3SPD6mM7vA/SRgsPVAw8JI/AAAAAAAAAq8/2MjjJxF3JQg/S220/balaram.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2387670469462939744.post-5852558238019481287</id><published>2007-09-25T16:54:00.001-07:00</published><updated>2007-09-25T17:00:03.238-07:00</updated><title type='text'>Enterprise Data Architecture: Who Needs It?</title><content type='html'>If your goal is enterprise-wide data sharing, an enterprise data architecture (EDA) is what you need for revamped decision support. As we enter the 21st century, many organizations are exactly where they were 20 years ago. 
The technology infrastructure supporting them is full of complex interfaces creating impediments to rapid and lasting business change. Companies have consciously ignored staff warnings regarding building and maintaining this technological Tower of Babel. Building an EDA can give your staff an integrated view of enterprise data organization for strategic and tactical decision support. However, it's critical to understand the business value of an EDA and the factors that can make or break your EDA.&lt;br /&gt;&lt;br /&gt;A typical organization has a complex web of interrelated business systems and databases that impede change. An EDA enables organizational change because it organizes data around the enterprise's data subjects. Such organization permits multiple application systems to use the shared data resource. An EDA pulls together, validates, cleanses and integrates data from disparate source application systems, providing the end-user community with an integrated view of enterprise data. As a result, operational departments can access the data for strategic and tactical decision support, day-to-day operations and general reporting. &lt;br /&gt;&lt;br /&gt; Constructing Your Enterprise Data Architecture&lt;br /&gt;&lt;br /&gt;You wouldn't start building a skyscraper without first planning the entire building. The architect must meet with the customer to determine the structure's specifications, which are translated into a set of architectural blueprints providing steelworkers, carpenters, electricians and plumbers with the customer's vision. With the overall plan, separate teams can go to work with the knowledge that the building components will fit together.&lt;br /&gt;&lt;br /&gt;Similarly, the information technology (IT) world is full of inspired tradespeople (programmers, systems analysts, database designers and system designers) who crave the challenge and excitement of solving problems with their own unique approaches. They usually do an excellent job. 
However, the data and process objects they use overlap substantially.&lt;br /&gt;&lt;br /&gt;How do you create order from IT development and deployment chaos? You start with a well-conceived plan that details the tasks, milestones, resources, time and personnel required to design an enterprise business model. The next step is to create an enterprise data model and an enterprise process model. Finally, you identify the relationships among data and process objects to create the enterprise business model. Figure 1 shows typical data subject areas.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_o3SPD6mM7vA/RvmgjItqNOI/AAAAAAAAAXE/TsG9UocuC8s/s1600-h/enterprise_data.gif"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;" src="http://2.bp.blogspot.com/_o3SPD6mM7vA/RvmgjItqNOI/AAAAAAAAAXE/TsG9UocuC8s/s200/enterprise_data.gif" border="0" alt="" id="BLOGGER_PHOTO_ID_5114295377350767842" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Figure 1: Enterprise Data&lt;br /&gt;&lt;br /&gt;To implement the EDA specified in the enterprise business model, you create a plan that provides executives and middle managers with a concise understanding of the current IT state as compared to the enterprise model's vision. The plan drives and reports ongoing progress. An incremental approach is most effective for creating an EDA environment, because it ensures that you build only the components currently required. The first EDA implementation will create the first subject area data objects deployed as part of the EDA environment. After initially deploying the environment, the team should focus on creating the next iteration.&lt;br /&gt;&lt;br /&gt;Each successive EDA environment iteration should benefit from reusing knowledge gained in previous iterations, experience and even currently deployed data objects. 
You should deploy each successive iteration to end users on a quarterly basis. In order to meet this schedule, the EDA team must ensure that the scope of the new data objects included in the environment is manageable within the quarter.&lt;br /&gt;Data Boundaries&lt;br /&gt;&lt;br /&gt;All matter is built on the foundation of a small number of atomic building blocks. If you can label and construct matter from a few atomic constructs, why is it that data processors can't construct and label such data based on a finite number of "core" data elements? Why is it that a typical IT shop's source code libraries contain hundreds or even thousands of data element names? Are all data molecules formed from a finite number of data atoms?&lt;br /&gt;&lt;br /&gt;Fortunately, you can reduce these bloated amounts of data element names. The majority of names are homonyms, synonyms and aliases created in the absence of three factors: naming standards, an EDA perspective and planning for data creation and use. Many programmers, analysts and database specialists create data element names and label their own data without regard for the way others in the enterprise do so.&lt;br /&gt;&lt;br /&gt;Using the EDA construct, your organization can design and implement a solid foundation of core data elements and entities from which you can derive reusable, accurate, integrated and reliable information.&lt;br /&gt;Executing Your Enterprise Data Architecture&lt;br /&gt;&lt;br /&gt;It's critical that you pay attention to five primary project factors to ensure EDA success:&lt;br /&gt;&lt;br /&gt;    * Direct participation by top management. To implement an integrated, cross-functional EDA, it's imperative to gain corporate top management commitment. In most organizations, data integration projects face a great deal of resistance within and outside the IT department. Data processors are a creative lot by nature who enjoy the challenge of unraveling the nightmare world of data interfaces. 
In addition, end users want exclusive rights over their own data. Top management must motivate both groups to jump aboard the EDA initiative. Top management must also participate directly in the initiative's planning and quality review. Without this commitment level, you might encounter political roadblocks.&lt;br /&gt;    * Well-defined scope. The EDA team must have the time and energy to learn about building an EDA without being encumbered by in-house politics. In other words, the team must learn to walk before it runs. First-time, large-scale EDA projects have a tendency to fail due to political and organizational conflicts.&lt;br /&gt;    * Design stability. Stability doesn't imply that the EDA will never change; you can make most changes without requiring anyone to rewrite application systems. The EDA's underlying logical data model should produce physical tables independent of their physical implementation on current hardware and systems software. As the underlying technology changes over time, the EDA's logical data structures will remain valid.&lt;br /&gt;    * Data object abstraction and generalization. You must design the EDA's logical data model with an appropriate data normalization level; as a result, you'll be able to add entities and attributes to the model without rewriting existing applications.&lt;br /&gt;    * A data modeling team with one or two members. Integrating data across business areas represents the EDA's primary value. Many users think that the time it takes to integrate the data into a coherent data model results in additional delay for little return. This perception is erroneous. The efforts of one or two data modelers over the span of a few months will result in a coherent model. Enterprise-level data modeling is not a multithreading discipline. 
&lt;br /&gt;&lt;br /&gt;Sharing Data&lt;br /&gt;&lt;br /&gt;Data administrators typically have difficulty convincing management that information is an asset of the entire organization -- not the private property of individual operating units exerting tremendous political pressure on the IT staff to build something at the expense of enterprise needs. It's imperative that you draft an enterprise data architecture strategy document that's tied directly to your organization's business strategy.&lt;br /&gt;&lt;br /&gt;When you've conquered the politics of enterprise data modeling, created a robust enterprise data model and populated your enterprise-level databases, the DBMS (database management system) software provides the capabilities necessary to share data across the enterprise. The DBMS not only ensures that there will be data integrity, security, concurrency and high enterprise-level database availability, but it also provides fault-tolerant backup and recovery, performance monitoring and system management capabilities.&lt;br /&gt;&lt;br /&gt;Data management theorists and practitioners have aimed for enterprise-wide data sharing for more than 25 years. With the enterprise-level database construct, we've taken bold steps in our pursuit of this goal. 
As shown in Figure 2, the enterprise-level database should evolve into the physical instantiation of the enterprise data model.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_o3SPD6mM7vA/RvmgvotqNPI/AAAAAAAAAXM/aCSV1buAe6s/s1600-h/enterprise_data1.gif"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;" src="http://4.bp.blogspot.com/_o3SPD6mM7vA/RvmgvotqNPI/AAAAAAAAAXM/aCSV1buAe6s/s200/enterprise_data1.gif" border="0" alt="" id="BLOGGER_PHOTO_ID_5114295592099132658" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Figure 2: Enterprise Data Architecture&lt;br /&gt;Migrating to an Enterprise Data Architecture&lt;br /&gt;&lt;br /&gt;You can't implement an EDA in a single release, nor will all existing legacy systems migrate to it at the same time. You should implement your EDA in stages. However, you must factor a few issues into choosing the optimal migration strategy. There is your return on investment, for one thing. The migration strategy should follow a pay-as-you-go philosophy and provide tangible benefits to the organization as quickly as possible. The strategy should minimize both business and technical risks. Avoiding unproven new technology, for example, ensures that your basic architectural core and concepts will be sound.&lt;br /&gt;&lt;br /&gt;Migration to an EDA from existing application-specific databases shouldn't be an end in itself. Rather, migration should take place as part of two initiatives:&lt;br /&gt;&lt;br /&gt;    * Increments that follow operational systems portfolio restructuring, implementing new cross-departmental or cross-functional application systems.&lt;br /&gt;    * Increments that respond to departmental and enterprise data requirements simultaneously. &lt;br /&gt;&lt;br /&gt;You must carefully evaluate the degree of cultural change on the IT staff, the user community and your organization's customer base. 
You shouldn't let the rate of change be excessively rapid or drastic. Implementing an EDA should be evolutionary -- not revolutionary.&lt;br /&gt;Enterprise Data Architecture Business Value&lt;br /&gt;&lt;br /&gt;It's difficult to dispute the value of having subject-oriented data at the corporate level, giving you a common view and understanding of data regardless of the business function. A customer is a customer, regardless of who's asking the question. This shared data environment offers more accurate and complete information for improved decision making. An EDA offers several benefits that will add business value to your company. The architecture will:&lt;br /&gt;&lt;br /&gt;    * Provide an understanding of the fundamental health of the enterprise. Many Fortune 1000 companies can't accurately determine how many customers they have. They also can't figure out whether they're making or losing money on any one product or service. Normally, profit and loss metrics are accurate only at the profit center level. Rolling up these figures to the corporate level is impossible for most organizations. Enterprises that have built an EDA find themselves able to compete more successfully than their competitors.&lt;br /&gt;    * Speed application system development. 
As you add more enterprise data subjects to the EDA, the application development rate increases because the data already exists.&lt;br /&gt;    * Provide the tools and processes required for IT to effectively manage the complexities of your organization's most precious asset, its information.&lt;br /&gt;    * Set the foundation for sharing data across the enterprise in a controlled and managed environment.&lt;br /&gt;    * Provide a consistent enterprise-wide view to the end user ensuring data integrity and reliability across business systems at a global level.&lt;br /&gt;    * Increase the ability of your knowledge-workers to transform data into information.&lt;br /&gt;    * Provide a standard framework for addressing data issues and data design decisions in order to develop durable business solutions.&lt;br /&gt;    * Enable your organization to have a high degree of control over the replication and duplication of data.&lt;br /&gt;    * Enable your business units to process information in a more cost-effective fashion. &lt;br /&gt;&lt;br /&gt;An Enterprise Data Architecture Pays for Itself&lt;br /&gt;&lt;br /&gt;Our preoccupation with unraveling the decades of "silo" application system and data store design shouldn't surprise anyone. Organizations have dug themselves into a deep "data hole" that can only be filled by designing and implementing an EDA. An EDA offers a significant return on investment primarily because it supports creating reusable data. You build data once, and you can share it many times across many application systems.&lt;br /&gt;&lt;br /&gt;An EDA is integral to solving the data integrity problems many organizations face. From an IT perspective, managing many application systems and redundant data stores is a difficult and complex endeavor. From a business perspective, identical queries often yield different answers because each functional area has its own data.&lt;br /&gt;&lt;br /&gt;Unfortunately, most people want immediate gratification. 
It's worse in the business world because at the end of every quarter, you must perform for the bottom line. Increasing flexibility and reducing time to market won't happen accidentally through technology acquisitions, software packages or custom developed systems. It will only happen if you invest in the design and implementation of an EDA.&lt;br /&gt;A Win/Win Data Solution&lt;br /&gt;&lt;br /&gt;Most organizations are finding that corporate survival requires shifting course. By creating an EDA based on your organization's fundamental data building blocks, you'll provide a foundation on which to respond rapidly to the changing demands placed on your business in this millennium.&lt;br /&gt;&lt;br /&gt;The shift from building departmental solutions to building enterprise solutions will require major cultural changes within the enterprise's business and IT departments. To compete and win in this century, you must discontinue building business systems and data stores that reflect management organization. 
Instead, focus on future business systems and data stores that reflect the true needs inherent in delivering products and services to the customer.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2387670469462939744-5852558238019481287?l=rht-bi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://rht-bi.blogspot.com/feeds/5852558238019481287/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2387670469462939744&amp;postID=5852558238019481287' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2387670469462939744/posts/default/5852558238019481287'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2387670469462939744/posts/default/5852558238019481287'/><link rel='alternate' type='text/html' href='http://rht-bi.blogspot.com/2007/09/enterprise-data-architecture-who-needs.html' title='Enterprise Data Architecture: Who Needs It?'/><author><name>Tidhi</name><uri>http://www.blogger.com/profile/11392471881762637742</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_o3SPD6mM7vA/SRgsPVAw8JI/AAAAAAAAAq8/2MjjJxF3JQg/S220/balaram.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_o3SPD6mM7vA/RvmgjItqNOI/AAAAAAAAAXE/TsG9UocuC8s/s72-c/enterprise_data.gif' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2387670469462939744.post-9127700482812250453</id><published>2007-09-25T16:47:00.000-07:00</published><updated>2007-09-25T16:49:39.909-07:00</updated><title type='text'>The Trial-and-Error Method for Data Architecture</title><content type='html'>Three of the common mistakes beginners make in architecting data 
include:&lt;br /&gt;&lt;br /&gt;1. Letting enterprise applications inspire the architecture. BI/DW data architecture design, specifically DWs and data marts, should not be designed like the enterprise applications used as the source systems. But people design what they know.&lt;br /&gt;&lt;br /&gt;That may be a quick way to design the DW, but it puts the emphasis on operational processing rather than BI. This can make the downstream data marts or reports add complex and time-consuming processing in order for the data to be consumable. And if there is only a DW, it is harder to build and deploy BI applications.&lt;br /&gt;&lt;br /&gt;Many longstanding DWs have followed this approach. If you hear that the DW is too complex, makes it difficult to create BI queries and processes queries slowly, then the BI/DW team may have based its design on the enterprise application. In their defense, they may not have even realized it.&lt;br /&gt;&lt;br /&gt;This approach was not designed from the business user's BI and performance management requirements but rather from the transactional processing requirements of the source systems. It doesn't take into account many of the best practices and methods that can be used to provide a truly significant business ROI.&lt;br /&gt;&lt;br /&gt;Enterprise application vendors generally fall into this design trap when they initially build their DW, BI or corporate performance management (CPM) solutions because it reflects their frame of reference, but it can be just as damaging to your BI and CPM solutions.&lt;br /&gt;&lt;br /&gt;2. Engaging in DW schema wars. The team designing the DW may fall into the "religious" war between having everything in third normal form (3NF) or in a dimensional model. As with the first challenge, the basic concern is that the design needs to be created based on business needs rather than an esoteric data modeling concept. Most DW environments need data stored in both 3NF and dimensional models, not either/or. 
The impact shifts data integration logic to the BI application, thus increasing time and costs to develop the BI solution.&lt;br /&gt;&lt;br /&gt;The flip side is that the DW team decides everything should be stored in a dimensional model and not in a normalized manner. Although this is great for BI reports and analysis - hence, perfect for a data mart - if your DW needs to support history, has changing dimensions and you need to implement data integrity and quality, then the normalized form is the best practice under these conditions.&lt;br /&gt;&lt;br /&gt;This mixed environment is really a DW utilizing a normalized form to store data historically and manage change data capture (CDC) with a data mart in a dimensional model to enable BI reporting and analysis. Nowadays these two approaches can be implemented within the same database using different logical areas, such as a schema in an Oracle environment.&lt;br /&gt;&lt;br /&gt;3. Snubbing summary tables. I love it when someone tells me that they don't need a data mart or summary tables, relying on their DW tables and the associated database technology to provide all the performance they need. They wonder why they need to store the data again when they bought the best database and a terrific hardware platform with a lot of memory and fast disk arrays.&lt;br /&gt;&lt;br /&gt;The first thing I do is check out the BI reports and analysis. The data has to be summarized and aggregated to start most analysis. If it is not done in the DW/data mart, then the BI report code is going to do it, making it much more costly to develop and maintain.&lt;br /&gt;&lt;br /&gt;In addition, you'll also see some of these reports creating temporary summary tables to improve performance.  Instead of doing it once for everyone in a data mart, each BI report does it over again. 
Not only is this time-consuming, but every time a different person does it, the numbers become more likely to differ.&lt;br /&gt;&lt;br /&gt;Even if there are no summary tables, there may be BI cubes or, even worse, data shadow systems built by the business users to make up for this shortcoming in the data architecture.&lt;br /&gt;&lt;br /&gt;A little patience goes a long way when a business is designing its data architecture. In future columns I'll discuss some of the pitfalls in data integration, BI, project management, production and ongoing support.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2387670469462939744-9127700482812250453?l=rht-bi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://rht-bi.blogspot.com/feeds/9127700482812250453/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2387670469462939744&amp;postID=9127700482812250453' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2387670469462939744/posts/default/9127700482812250453'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2387670469462939744/posts/default/9127700482812250453'/><link rel='alternate' type='text/html' href='http://rht-bi.blogspot.com/2007/09/trial-and-error-method-for-data.html' title='The Trial-and-Error Method for Data Architecture'/><author><name>Tidhi</name><uri>http://www.blogger.com/profile/11392471881762637742</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='http://4.bp.blogspot.com/_o3SPD6mM7vA/SRgsPVAw8JI/AAAAAAAAAq8/2MjjJxF3JQg/S220/balaram.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2387670469462939744.post-4071502060389392728</id><published>2007-09-21T18:57:00.000-07:00</published><updated>2007-09-21T19:04:41.986-07:00</updated><title type='text'>Open Source BI</title><content type='html'>Jaspersoft&lt;br /&gt;&lt;br /&gt;Jaspersoft Reports is one of the key open source rivals to BIRT and is used in many open source projects; Bizgres, led by Greenplum, is one example. The reporting tool actually preceded BIRT and set many trends in Java report writing, including output in several formats such as HTML, XML, PDF and CSV.&lt;br /&gt;&lt;br /&gt;But yet again I was frustrated by the install process. I am not an Ant expert, but not a neophyte either. Still, I could not get the Ant build script to work for Jaspersoft Reports no matter what or where I tried, including Eclipse's excellent Ant engine. So after the allotted 3 hours turned to 4, I again threw in the towel. But I shall return, because of the next software tried.&lt;br /&gt;&lt;br /&gt;Pentaho&lt;br /&gt;&lt;br /&gt;Pentaho can be thought of as an integrator of BI software including BIRT, Jaspersoft and JFreeReport charting plus reporting tools; Mondrian OLAP and jPivot analysis engines; Weka data mining; Firebird and Shark for underlying database and workflow; plus Kettle Extract Transform Load tools among others. Pentaho provides the scripting and overall integration services along with the integration design. This is not insignificant, because users starting from scratch would have a formidable task to do the equivalent.&lt;br /&gt;&lt;br /&gt;Yet the complete system is all delivered as open source, and we had the best success getting some of the systems like BIRT and Jaspersoft Reports to work in Pentaho. 
However, even with Pentaho, the demo system and database took so much time and effort to get going that we reluctantly imposed a cutoff at 5 1/2 hours. Again, we now know better where to find the data and documentation to make it all work, but that awaits another day.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2387670469462939744-4071502060389392728?l=rht-bi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://rht-bi.blogspot.com/feeds/4071502060389392728/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2387670469462939744&amp;postID=4071502060389392728' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2387670469462939744/posts/default/4071502060389392728'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2387670469462939744/posts/default/4071502060389392728'/><link rel='alternate' type='text/html' href='http://rht-bi.blogspot.com/2007/09/open-source-bi.html' title='Open Source BI'/><author><name>Tidhi</name><uri>http://www.blogger.com/profile/11392471881762637742</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_o3SPD6mM7vA/SRgsPVAw8JI/AAAAAAAAAq8/2MjjJxF3JQg/S220/balaram.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2387670469462939744.post-1395210251862187491</id><published>2007-05-31T05:10:00.000-07:00</published><updated>2007-05-31T05:11:01.125-07:00</updated><title type='text'>How to design if many to many relationship exists between dimension and fact table</title><content type='html'>Overview&lt;br /&gt;While designing a data mart or warehouse, you sometimes find cases where a many-to-many relationship exists between 
dimension and fact table. Usually you have a one-to-many relationship between a dimension and a fact table, which works best for OLAP cubes. In this article I focus on such M2M cases and how to design around them.&lt;br /&gt;&lt;br /&gt;Design&lt;br /&gt;Let's consider a case. We have a time dimension with the regular attributes, like year, month, date, day, week of year (WOY), day of week (DOW), etc. And we have a fact table with the following date columns: schedule date, shipped date, order date and promised date.&lt;br /&gt;As per the regular design, we can either store these dates as they are or store the corresponding time_id from the time dimension. If we store a time dimension id (surrogate key) for each date column, we create a many-to-many relationship, because one fact row now points to several rows of the same dimension. If we don't store ids, we will not be able to use these columns easily for data analysis.&lt;br /&gt;To resolve this we can use the "Role Based Dimension" concept.&lt;br /&gt;1. Create the views "schedule_date_vw", "shipped_date_vw" and "promised_date_vw", each defined as SELECT * FROM dim_time. This creates a separate role of the time dimension for each date column&lt;br /&gt;2. Now use these views as dimensions in your schema&lt;br /&gt;3. Add the dim_time dimension table to the design&lt;br /&gt;4. 
Use the surrogate keys from the views/dimensions created above, along with dim_time, and store them in the fact table&lt;br /&gt;&lt;br /&gt;Conclusion&lt;br /&gt;This resolves the many-to-many relationship, and the design can easily be used in OLAP cubes for better data analysis.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2387670469462939744-1395210251862187491?l=rht-bi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://rht-bi.blogspot.com/feeds/1395210251862187491/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2387670469462939744&amp;postID=1395210251862187491' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2387670469462939744/posts/default/1395210251862187491'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2387670469462939744/posts/default/1395210251862187491'/><link rel='alternate' type='text/html' href='http://rht-bi.blogspot.com/2007/05/how-to-design-if-many-to-many.html' title='How to design if many to many relationship exists between dimension and fact table'/><author><name>Tidhi</name><uri>http://www.blogger.com/profile/11392471881762637742</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_o3SPD6mM7vA/SRgsPVAw8JI/AAAAAAAAAq8/2MjjJxF3JQg/S220/balaram.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2387670469462939744.post-6567864382808379676</id><published>2007-05-31T05:08:00.001-07:00</published><updated>2007-05-31T05:08:29.750-07:00</updated><title type='text'>How to design fact table for multicurrency column</title><content type='html'>Most of the time we have companies doing business in more than one country. 
If you have KPIs that demand a comparison in the original currency, you will need a provision for it in your design. In this article I focus on this case.&lt;br /&gt;&lt;br /&gt;Design&lt;br /&gt;There are two ways you can design this.&lt;br /&gt;1. Add a column that holds the conversion rate for that date. Your fact table will then have the following columns: local currency amount, conversion rate, and optionally a USD amount column that holds the converted amount. This serves the purpose of recording the conversion rate. However, if you want to slice and dice the data by currency and by point in time, use the next approach.&lt;br /&gt;2. Create a Currency Dimension that holds the conversion rates for every day. In the fact table, store the respective surrogate key along with the amount columns. You can then design an analysis cube that uses this dimension.&lt;br /&gt;&lt;br /&gt;Conclusion&lt;br /&gt;This way your queries, reports, or cubes can efficiently analyze the data with respect to a daily changing rate.&lt;br /&gt;&lt;br /&gt;Posted by Milind Zodge at January 27, 2007 10:45 PM&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2387670469462939744-6567864382808379676?l=rht-bi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://rht-bi.blogspot.com/feeds/6567864382808379676/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2387670469462939744&amp;postID=6567864382808379676' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2387670469462939744/posts/default/6567864382808379676'/><link rel='self' type='application/atom+xml' 
href='http://www.blogger.com/feeds/2387670469462939744/posts/default/6567864382808379676'/><link rel='alternate' type='text/html' href='http://rht-bi.blogspot.com/2007/05/how-to-design-fact-table-for.html' title='How to design fact table for multicurrency column'/><author><name>Tidhi</name><uri>http://www.blogger.com/profile/11392471881762637742</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_o3SPD6mM7vA/SRgsPVAw8JI/AAAAAAAAAq8/2MjjJxF3JQg/S220/balaram.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2387670469462939744.post-2765883401790476783</id><published>2007-05-31T05:07:00.001-07:00</published><updated>2007-05-31T05:07:37.295-07:00</updated><title type='text'>What is metrics and what are the different types of metrics</title><content type='html'>Most of the time we have companies doing business in more than one country. If you have KPIs that demand a comparison in the original currency, you will need a provision for it in your design. In this article I focus on this case.&lt;br /&gt;&lt;br /&gt;Design&lt;br /&gt;There are two ways you can design this.&lt;br /&gt;1. Add a column that holds the conversion rate for that date. Your fact table will then have the following columns: local currency amount, conversion rate, and optionally a USD amount column that holds the converted amount. This serves the purpose of recording the conversion rate. However, if you want to slice and dice the data by currency and by point in time, use the next approach.&lt;br /&gt;2. Create a Currency Dimension that holds the conversion rates for every day. In the fact table, store the respective surrogate key along with the amount columns. 
Now you can design an analysis cube to use this dimension.&lt;br /&gt;&lt;br /&gt;Conclusion&lt;br /&gt;This way you can achieve maximum performance for your queries or reports or cubes to analyze the data with respect to a daily changing rate.&lt;br /&gt;&lt;br /&gt;Posted by Milind Zodge at January 27, 2007 10:45 PM&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2387670469462939744-2765883401790476783?l=rht-bi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://rht-bi.blogspot.com/feeds/2765883401790476783/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2387670469462939744&amp;postID=2765883401790476783' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2387670469462939744/posts/default/2765883401790476783'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2387670469462939744/posts/default/2765883401790476783'/><link rel='alternate' type='text/html' href='http://rht-bi.blogspot.com/2007/05/what-is-metrics-and-what-are-different_2175.html' title='What is metrics and what are the different types of metrics'/><author><name>Tidhi</name><uri>http://www.blogger.com/profile/11392471881762637742</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_o3SPD6mM7vA/SRgsPVAw8JI/AAAAAAAAAq8/2MjjJxF3JQg/S220/balaram.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2387670469462939744.post-4613614038284935427</id><published>2007-05-14T10:15:00.000-07:00</published><updated>2007-05-14T10:25:56.224-07:00</updated><title type='text'>How do we know if a software salesperson is ****  - Expectation commited by a vendor</title><content type='html'>Advice: &lt;br /&gt;&lt;br 
/&gt;Understand that when dealing with any type of vendor, first they want to make money... and that's OK.&lt;br /&gt;&lt;br /&gt;It's really important to state our full expectations for performance and SLAs.&lt;br /&gt;&lt;br /&gt;My Trick:&lt;br /&gt;Get it in writing and update the service level agreement. E.g., spell out what the performance expectations are and add them to the contract.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2387670469462939744-4613614038284935427?l=rht-bi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://rht-bi.blogspot.com/feeds/4613614038284935427/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2387670469462939744&amp;postID=4613614038284935427' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2387670469462939744/posts/default/4613614038284935427'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2387670469462939744/posts/default/4613614038284935427'/><link rel='alternate' type='text/html' href='http://rht-bi.blogspot.com/2007/05/how-do-we-know-if-software-salesperson.html' title='How do we know if a software salesperson is ****  - Expectation commited by a vendor'/><author><name>Tidhi</name><uri>http://www.blogger.com/profile/11392471881762637742</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_o3SPD6mM7vA/SRgsPVAw8JI/AAAAAAAAAq8/2MjjJxF3JQg/S220/balaram.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2387670469462939744.post-7770297931976789694</id><published>2007-05-14T10:13:00.000-07:00</published><updated>2007-05-14T10:14:24.450-07:00</updated><title type='text'>For every rule there is an exception; for each exception there are more 
exceptions…</title><content type='html'>Implementing an ETL process involves many steps. One such step is creating a mapping document. This mapping document describes the data mapping between the source systems and the target, and the rules of data transformation.&lt;br /&gt;&lt;br /&gt;E.g., the table/column map between source and target, rules to identify unique rows, not-null attributes, unique values, attribute ranges, transformation rules, etc.&lt;br /&gt;&lt;br /&gt; &lt;br /&gt;Without going into further details of the document, let's analyze the very next step. It seems obvious and natural to start development of the ETL process. The ETL developer is all fired up, comes up with a design document and starts developing; in a few days' time the code is ready for data loading.&lt;br /&gt;&lt;br /&gt; &lt;br /&gt;But unexpectedly (?) the code starts having issues every few days. Issues are found and fixed. And then it fails again. What’s happening? The analysis was done properly; the rules were chalked out and implemented according to the mapping document. So why are issues popping up?  Was something missed?&lt;br /&gt;Maybe not! Isn’t it normal to have more issues early in the lifetime of a process?&lt;br /&gt; &lt;br /&gt;Maybe yes! You have surely missed ‘Source System Data Profiling’. The business analyst has told you the rules for how the data is structured in the source system and how it is supposed to behave, but he/she has not told you the ‘ifs and buts’, the EXCEPTIONS to those rules.&lt;br /&gt; &lt;br /&gt;To be realistic, it is not possible for anyone to just recite all the rules and exceptions to you like a parrot. You have to collaborate and dig out the truth. The choice is yours: do data profiling on the source system and try to break all the rules the analyst told you, or wait for the process to go live and then wake up every night as the load fails.  
If you are lucky enough, you also get to deal with an unhappy user every morning when you get to the office. &lt;br /&gt; &lt;br /&gt;Make the right choice; don’t skip ‘Source System Data Profiling’ before writing a single line of code. Question every rule. Try to find exceptions to the rules. Say there are at least 20 tables. A table will have 30 columns on average, and each column 100k values on average. If you multiply the number of tables * columns * data values (20 * 30 * 100,000 = 60 million), you get the number of reasons why your assumptions may be wrong.   It’s like unit testing the source data even before loading it. There is a reason why machines alone cannot do your job; there is a reason why IT jobs pay more.&lt;br /&gt; &lt;br /&gt;Remember, ‘for every rule there is an exception; for each exception there are more exceptions…’&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2387670469462939744-7770297931976789694?l=rht-bi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://rht-bi.blogspot.com/feeds/7770297931976789694/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2387670469462939744&amp;postID=7770297931976789694' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2387670469462939744/posts/default/7770297931976789694'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2387670469462939744/posts/default/7770297931976789694'/><link rel='alternate' type='text/html' href='http://rht-bi.blogspot.com/2007/05/for-every-rule-there-is-exception-for.html' title='For every rule there is an exception; for each exception there are more exceptions…'/><author><name>Tidhi</name><uri>http://www.blogger.com/profile/11392471881762637742</uri><email>noreply@blogger.com</email><gd:image 
rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_o3SPD6mM7vA/SRgsPVAw8JI/AAAAAAAAAq8/2MjjJxF3JQg/S220/balaram.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2387670469462939744.post-1959445435174657992</id><published>2007-05-02T11:40:00.000-07:00</published><updated>2007-05-02T11:41:27.470-07:00</updated><title type='text'>Data Staging Strategy considerations</title><content type='html'>Overview&lt;br /&gt;Whenever we start designing a Data Mart/Data Warehouse environment, the first thing that comes up is the staging area. In this article I focus on the different types of Data Staging Techniques and how to define a Data Staging Strategy.&lt;br /&gt;&lt;br /&gt;Detail&lt;br /&gt;While defining the strategy you will have to answer the following questions:&lt;br /&gt;What technique will you be using?&lt;br /&gt;What kind of data load will it be, full or incremental?&lt;br /&gt;Where will the staging data reside?&lt;br /&gt;Where should aggregation be performed?&lt;br /&gt;&lt;br /&gt;The following well-known techniques are available:&lt;br /&gt;1. Store and Forward: In this technique, data is stored in the staging area and then transformed and loaded into the Data Warehouse environment&lt;br /&gt;2. Direct Database insert/update: In this technique, data is read directly from the ODS (operational data store) and inserted or updated directly in the Data Warehouse environment&lt;br /&gt;&lt;br /&gt;The Data Load types are:&lt;br /&gt;1. Full Data: Here you use all the rows and update the Data Warehouse environment with the data. This is a time-consuming process, and the processing time will gradually increase as the data grows&lt;br /&gt;2. Delta or Incremental: Here you get only the changed/new records and process them so that the information is passed to the Data Warehouse environment&lt;br /&gt;&lt;br /&gt;Types of Staging Data Stores:&lt;br /&gt;1. File: Data can be placed in a file. 
If many sorting operations need to be performed, storing data in this format is beneficial&lt;br /&gt;2. Table: Staging data can be stored in a table, either permanently or temporarily until it gets published to the Data Warehouse area&lt;br /&gt;&lt;br /&gt;Where to perform aggregation: If aggregation is required by the Data Warehouse process, it can be performed either while loading the staging area or while loading the Data Warehouse area. Decide where you want to perform it.&lt;br /&gt;&lt;br /&gt;Conclusion&lt;br /&gt;By considering all of these aspects you can prepare a good Data Staging Strategy.&lt;br /&gt;&lt;br /&gt;Please refer to the book "Data Strategy" by Sid Adelman, Larissa Moss and Majid Abai for more details.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2387670469462939744-1959445435174657992?l=rht-bi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://rht-bi.blogspot.com/feeds/1959445435174657992/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2387670469462939744&amp;postID=1959445435174657992' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2387670469462939744/posts/default/1959445435174657992'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2387670469462939744/posts/default/1959445435174657992'/><link rel='alternate' type='text/html' href='http://rht-bi.blogspot.com/2007/05/data-staging-strategy-considerations.html' title='Data Staging Strategy considerations'/><author><name>Tidhi</name><uri>http://www.blogger.com/profile/11392471881762637742</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='http://4.bp.blogspot.com/_o3SPD6mM7vA/SRgsPVAw8JI/AAAAAAAAAq8/2MjjJxF3JQg/S220/balaram.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2387670469462939744.post-8716912888866631931</id><published>2007-05-02T11:38:00.001-07:00</published><updated>2007-05-02T11:38:20.867-07:00</updated><title type='text'>How to use Oracle's Metadata package for impact analysis</title><content type='html'>Overview&lt;br /&gt;Business is always changing, and you have to change your systems to keep up with it.&lt;br /&gt;&lt;br /&gt;Before making any change you want to perform an impact analysis. Most data modeling tools have a provision for it. In this article I focus on how to perform this task if you don't have such a tool.&lt;br /&gt;&lt;br /&gt;Details&lt;br /&gt;Consider a case: we have an Oracle database, want to alter a column width, and would like to see wherever this column is used/referenced.&lt;br /&gt;&lt;br /&gt;We can use Oracle's metadata package (DBMS_METADATA) as indicated below:&lt;br /&gt;&lt;br /&gt;SET pagesize 0&lt;br /&gt;SET long 90000&lt;br /&gt;SET feedback off&lt;br /&gt;SET echo off&lt;br /&gt;&lt;br /&gt;SELECT DBMS_METADATA.GET_DDL('TABLE',ut.table_name)&lt;br /&gt;FROM USER_TABLES ut;&lt;br /&gt;&lt;br /&gt;This will give the DDL scripts for all the tables in your schema. Now you can use any text tool like Notepad to search for the required column and find the references.&lt;br /&gt;&lt;br /&gt;Conclusion&lt;br /&gt;There are various ways to do it. This is one of them. 
It will help you determine the impact exposure.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2387670469462939744-8716912888866631931?l=rht-bi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://rht-bi.blogspot.com/feeds/8716912888866631931/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2387670469462939744&amp;postID=8716912888866631931' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2387670469462939744/posts/default/8716912888866631931'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2387670469462939744/posts/default/8716912888866631931'/><link rel='alternate' type='text/html' href='http://rht-bi.blogspot.com/2007/05/how-to-use-oracles-metadata-package-for.html' title='How to use Oracle&apos;s Metadata package for impact analysis'/><author><name>Tidhi</name><uri>http://www.blogger.com/profile/11392471881762637742</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_o3SPD6mM7vA/SRgsPVAw8JI/AAAAAAAAAq8/2MjjJxF3JQg/S220/balaram.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2387670469462939744.post-3657879717694424685</id><published>2007-05-02T10:00:00.001-07:00</published><updated>2007-05-02T10:00:59.362-07:00</updated><title type='text'>What is metrics and what are the different types of metrics</title><content type='html'>Overview&lt;br /&gt;Any BI application's main role is to show information based on some measurements. These measurements are metrics. E.g., you measure how much you have sold, so total sales revenue is your metric. 
In this article I focus on the types of metrics, and on when and how to use the proper one in an application.&lt;br /&gt;&lt;br /&gt;Details&lt;br /&gt;There are three main types of metrics you can use in your application:&lt;br /&gt;&lt;br /&gt;1. Leading Indicators: These measure activities. Use them if you want to measure, for example, how many touches are required to convert a prospect into a customer. Generally these indicators show how many calls/activities you need to complete to achieve your goal.&lt;br /&gt;&lt;br /&gt;2. Lagging Indicators: These measure the outcome of activities. Use them if you want to measure business financial amounts like sales revenue. Generally these indicators show where you currently stand.&lt;br /&gt;&lt;br /&gt;3. Key Performance Indicators (KPI): These measure performance. Use them if you want to see how you are performing and whether that is good or bad, e.g. how sales revenue compares with the sales quota.&lt;br /&gt;&lt;br /&gt;Conclusion&lt;br /&gt;The proper metric depends on which application you are designing for. 
If it is a BAM-Business Activity Monitoring then Lagging Indicators or KPI will convey the information.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2387670469462939744-3657879717694424685?l=rht-bi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://rht-bi.blogspot.com/feeds/3657879717694424685/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2387670469462939744&amp;postID=3657879717694424685' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2387670469462939744/posts/default/3657879717694424685'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2387670469462939744/posts/default/3657879717694424685'/><link rel='alternate' type='text/html' href='http://rht-bi.blogspot.com/2007/05/what-is-metrics-and-what-are-different.html' title='What is metrics and what are the different types of metrics'/><author><name>Tidhi</name><uri>http://www.blogger.com/profile/11392471881762637742</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_o3SPD6mM7vA/SRgsPVAw8JI/AAAAAAAAAq8/2MjjJxF3JQg/S220/balaram.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2387670469462939744.post-5253509382207162859</id><published>2007-04-20T13:17:00.000-07:00</published><updated>2007-04-20T13:40:58.938-07:00</updated><title type='text'>Mistakes made by data warehouse project managers</title><content type='html'>FOREWORD&lt;br /&gt;&lt;br /&gt;Most organizations treat project management as an administrative function. A project manager often “manages” multiple projects. 
However, a more accurate way to define a project manager would be to say that he or she “administers” multiple projects because he/she is rarely involved in any daily project activities. The project teams merely report to him/her.&lt;br /&gt;&lt;br /&gt;Project managers, assuming that data warehouse projects are like any other project, are often surprised when their data warehouse project spins out of control. The requirements appear to be a “moving target;” the schedule keeps slipping; the source data is much dirtier than expected and is impacting the ETL team; the staff does not have the necessary skills and is not properly trained; communication between staff members takes too long; traditional roles and responsibilities, and how they are assigned, seem to result in too much rework; the traditional methodology does not seem to work; and so on.&lt;br /&gt;&lt;br /&gt;Techniques that work on other projects do not work well on data warehouse projects. This booklet describes how to avoid 10 common mistakes made by data warehouse project managers.&lt;br /&gt;&lt;br /&gt;ABOUT THE AUTHOR&lt;br /&gt;&lt;br /&gt;Larissa Moss, president of Method Focus Inc., specializes in data warehousing, business intelligence, information quality, data integration, project management, and spiral data warehouse methodologies. She presents and lectures at various conferences worldwide. She co-authored the books Data Warehouse Project Management, Impossible Data Warehouse Situations, Business Intelligence Roadmap, and Data Strategy. Her works have been published in trade journals, including DM Review, Teradata Magazine, and TDWI’s Business Intelligence Journal. Additionally, her white papers are available through the Cutter Consortium and NCR/Teradata. She can be reached at methodfocus@earthlink.net.&lt;br /&gt;&lt;br /&gt;1. 
Failing to Use a Methodology&lt;br /&gt;&lt;br /&gt;Software development has become relatively lax over the past two decades, and the use of system development methodologies has become more of an exception rather than a rule. Project teams, as well as business users, seem to think that with all the new development tools available, system development is (or should be) trivial. They are often surprised to learn that project managers and project teams must consider approximately 920 tasks when developing a data warehouse. Who can remember 920 tasks? No one. But everyone can look up 920 tasks in a methodology.&lt;br /&gt;&lt;br /&gt;Having the right kind of methodology is important. It cannot be a traditional “waterfall” methodology because that type of methodology assumes you are building a stand-alone “final” product, which does not have to integrate with any other product and will not dramatically evolve or expand over time. Thus, a traditional methodology does not include cross-organizational business integration tasks. Since a data warehouse is an evolving environment with many databases and applications, it is important to design databases and processes for reuse whenever possible. This requires specific integration tasks that a data warehouse methodology must provide.&lt;br /&gt;&lt;br /&gt;In addition, a data warehouse methodology must take into account that a data warehouse environment cannot be built all at once. In other words, the deliverable will not be a stand-alone “final” product, but will have to be expanded and enhanced over time. If a data warehouse is successful, then each release will most likely generate new requirements. Sometimes these requirements will be for a brand new data warehouse application, but many times they are simply an enhancement of an existing application. Periodically, these new requirements may even demand that new technology be evaluated and purchased. 
A data warehouse methodology provides appropriate tasks for all of these activities.&lt;br /&gt;&lt;br /&gt;Another differentiating aspect of data warehouse projects is that you have to manage multiple sub-projects in parallel. One such sub-project is the development of the data warehouse application (e.g., reports, canned queries, or customized cubes for slicing and dicing). Then there is the ETL process, including data profiling, data transformations, and data cleansing in addition to source data extracting and target data loading. A third sub-project may be building and loading the metadata repository. And there may even be a data mining deliverable, requiring its own development track.&lt;br /&gt;&lt;br /&gt;A data warehouse methodology includes tasks for all of these development activities, and recognizes that many of these activities can run simultaneously.&lt;br /&gt;&lt;br /&gt;Since metadata is an important deliverable, it deserves special mention when discussing methodologies. Not only does a data warehouse methodology have to include tasks for gathering, storing, and delivering metadata to the business community, it must also provide tasks for either evaluating and installing a purchased metadata repository product, or designing and building one.&lt;br /&gt;&lt;br /&gt;In an evolving and expanding data warehouse environment, where maximum reusability must be built into all deliverables, it is important to continuously review and improve the environment. That means reviewing old and new requirements against existing data warehouse databases and applications, and finding ways to reuse what has already been built. Such reviews may result in requirements for minor database design changes, or program changes to the ETL process, reports, queries, or other applications. 
The methodology must provide tasks for conducting such reviews and folding the resulting changes into the next data warehouse project.&lt;br /&gt;&lt;br /&gt;Taking the many development steps into account (from business case assessment to post-implementation review) and considering that most data warehouse projects are composed of several sub-projects, it is easy to understand that there are hundreds of tasks to be considered. Naturally, not all tasks have to be executed on each project, but all tasks must be known to the project manager so that he/she can pick the right ones for each data warehouse release.&lt;br /&gt;&lt;br /&gt;The role of a methodology is to provide a list of all possible tasks, their dependencies, the roles and responsibilities assigned to execute them, and the deliverables resulting from them. Not using a methodology almost guarantees that vital tasks will be dropped, requiring rework that could have been avoided.&lt;br /&gt;&lt;br /&gt;2. Ineffective Project Team Structure&lt;br /&gt;&lt;br /&gt;Traditional project teams are not structured to cope effectively with the dynamic nature of data warehouse projects and cannot react fast enough to the constant changes and challenges. What’s a traditional project team structure? Typically, the project manager alone defines and plans the project, and assigns a discrete set of tasks to each project team member. When a team member completes a task, the deliverable gets “handed over the cubicle wall” to the next team member who performs his/her assigned tasks and hands over the deliverable to the next person, and so on. Then on Friday afternoon, all team members submit a status report of their individually assigned tasks to the project manager who uses these reports to monitor and control the project activities. 
Occasionally, or regularly, team meetings are called to exchange information, and when a problem arises, special meetings are arranged with the business people or other stakeholders who can help resolve the issue.&lt;br /&gt;&lt;br /&gt;A data warehouse team must be much more flexible and dynamic than that. There should be a core team of four to five people who together define, plan, and co-lead the project. The core team should be thought of as a high-powered, self-organizing SWAT team. Core team members must be 100 percent available from the beginning to the end of the data warehouse project. They brainstorm together, assign work to each other, review each other’s deliverables (peer reviews), resolve issues, and make project-related decisions together. This team should be staffed with senior-level team members who are experts in:&lt;br /&gt;&lt;br /&gt;    * Project management (a lead person, not an administrator)&lt;br /&gt;    * Subject matter expertise (a business representative, not an&lt;br /&gt;      IT person) (1)&lt;br /&gt;    * Business analysis practices (data modeling and process modeling)&lt;br /&gt;    * System analysis techniques (light programming)&lt;br /&gt;    * Programming (ETL, OLAP, report writers, metadata repository, etc.)&lt;br /&gt;&lt;br /&gt;Each person on the core team can be, and probably will be, assigned multiple roles. 
The core team roles and their main responsibilities are listed in the following table.&lt;br /&gt;&lt;br /&gt;&lt;table&gt;&lt;tr&gt;&lt;th&gt;Core Team Role&lt;/th&gt;&lt;th&gt;Major Responsibilities&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Application Lead Developer&lt;/td&gt;&lt;td&gt;Design and oversee the development of the data warehouse access and analysis applications.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Business Representative (2)&lt;/td&gt;&lt;td&gt;Make business decisions, resolve disputes between business units, and improve the source data quality.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Data Administrator&lt;/td&gt;&lt;td&gt;Perform cross-organizational data analysis, establish naming standards, create the project-specific logical data models, and merge those models into an enterprise logical data model.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Database Expert (Architect and Administrator)&lt;/td&gt;&lt;td&gt;Design, load, monitor, and tune the data warehouse databases.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Data Quality Analyst&lt;/td&gt;&lt;td&gt;Assess source data quality and prepare data cleansing specifications for the ETL process.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;ETL Lead Developer&lt;/td&gt;&lt;td&gt;Design and oversee the development of the ETL process.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Metadata Administrator&lt;/td&gt;&lt;td&gt;Build or buy, enhance, load, and maintain the metadata repository.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Project Manager&lt;/td&gt;&lt;td&gt;Define, plan, control, and review all project activities.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Subject Matter Expert&lt;/td&gt;&lt;td&gt;Provide business knowledge about data, processes, business rules, metadata, and requirements.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Technical Architect&lt;/td&gt;&lt;td&gt;Establish and maintain the technical infrastructure (hardware, network, middleware, system software).&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;br /&gt;(1) One-hundred percent availability from a business representative is a critical success factor for a data warehouse project. 
If management resists releasing one businessperson full time, it’s an indication that they don’t support the data warehouse as a critical strategic business initiative.&lt;br /&gt;(2) The business representative role on the core team is usually assigned to the primary business user who represents the business units for which the data warehouse application is being developed. This person must be authorized to make business decisions on behalf of the business community he/she represents.&lt;br /&gt;&lt;br /&gt;Top&lt;br /&gt;&lt;br /&gt;3. Failing to Involve the Business People&lt;br /&gt;&lt;br /&gt;Data warehouse projects are notoriously dynamic. That can be good or bad. Usually, it is perceived as bad because requirements change constantly; the scope is hard to control; the timeframes for delivering applications are unreasonably short; the data is usually dirtier than expected; business people are hard to pin down to provide business rules for data cleansing; project team members are often unclear about their specific roles and responsibilities, etc.&lt;br /&gt;&lt;br /&gt;On the other hand, dynamic projects can be good because business people have an opportunity to learn about any new technology or tools early on. They also have an opportunity to “play” with their requirements and adjust them as they learn more about the capabilities and limitations of the data warehouse. IT folks can experiment with different database and application designs, and they can negotiate the project scope and delivery time to be more realistic depending on the difficulties they encounter. They can also profile the source data early and show all the data defects to the business people for resolution or deferment.&lt;br /&gt;&lt;br /&gt;It should be obvious that dynamic projects have to be set up differently from traditional projects if the dynamics are to have a positive impact on the project team or the project schedule. 
The differences include adoption of a rapid development approach similar to prototyping, acceptance of the software release concept, a self-organizing SWAT team, and full-time involvement of business people in project activities. This is a paradigm shift for how applications are developed, and only a few business and IT people are comfortable with it. In contrast, organizations that successfully practice “extreme programming” techniques understand the benefits of this new approach because the prerequisites for extreme programming are the same (rapid development, software releases, SWAT teams, and participation from business people).&lt;br /&gt;&lt;br /&gt;Why is it so important for business people to participate and what would they be doing on the projects? The most important reason business people should participate is to speed up the development work. It is a common complaint among IT people that situations come up several times a week for which they need input from the business people. But business people don’t often make themselves available. And when they do, it can be too late, especially when weeks have passed and IT has already assumed how best to resolve the situation. Sometimes IT guesses incorrectly, leading to rework, which can impact the project schedule. When combining these situations with changing business requirements that are not subjected to rigorous impact analysis, the result is a frustrated IT team and unhappy business users who don’t understand why IT takes so long to deliver the application. Business people must “live” the projects alongside IT in order for the entire project team to be more productive. Hence, business people must make time to become participating members of data warehouse project teams. 
They must participate in project planning, perform impact analysis on their own requirement (scope) changes, remove business-related road blocks (like data disputes between business areas), and perform other business-related project activities such as:&lt;br /&gt;&lt;br /&gt;    * Determining project deliverables for each software release&lt;br /&gt;    * Participating in tool evaluation and selection&lt;br /&gt;    * Negotiating data and functional requirements&lt;br /&gt;    * Participating in data and process modeling sessions&lt;br /&gt;    * Providing data definitions and business rules&lt;br /&gt;    * Participating in testing activities, including writing the test cases&lt;br /&gt;    * Profiling the source data and validating the quality of data&lt;br /&gt;    * Identifying the cleansing rules for dirty source data&lt;br /&gt;    * Validating/testing the accuracy of the ETL programs&lt;br /&gt;    * Validating/testing the accuracy of reports and queries&lt;br /&gt;    * Resolving disputes among business units&lt;br /&gt;    * Monitoring/auditing the data warehouse data on an ongoing basis&lt;br /&gt;    * Participating in post-implementation review discussions&lt;br /&gt;&lt;br /&gt;Business people must be told that they are an invaluable and indispensable part of an effective data warehouse project team because they possess certain knowledge and authority their IT counterparts don’t have. In addition, business people understand the severity and monetary impact of their organization’s business problems, and they are the only ones with the position and authority to negotiate the priorities of data warehouse projects.&lt;br /&gt;&lt;br /&gt;Top&lt;br /&gt;&lt;br /&gt;4. Failing to Have Application Releases&lt;br /&gt;&lt;br /&gt;The deliverable of a data warehouse project is usually a fully operable data warehouse application with a lot of functionality and a lot of data. 
The source data typically resides on multiple (and often heterogeneous) operational files or databases, which adds to the complexity of integrating the source data. The amount of data redundancy, data inconsistency, and data defects is habitually underestimated. Project teams who did not plan to spend the majority of their time on data cleansing are caught off guard—especially because they are expected to implement the fully functioning data warehouse application in an extremely short timeframe. In short, many project teams bite off much more than they can chew (i.e., their scope is much too large for their deadline).&lt;br /&gt;&lt;br /&gt;It has been said for years that “you cannot build a whole data warehouse [environment] in one big bang.” Nobody challenges that anymore. But that does not go far enough in the attempt to reduce scope and complexity to a manageable chunk of work. Therefore, the new mantra should be: “you should not build a data warehouse application in one big bang.”&lt;br /&gt;&lt;br /&gt;Following the principles of extreme programming, extreme project management, and extreme methodologies, we should also adopt extreme scoping. Extreme scoping means reducing the complexity, and thus the scope, of each project in order to deliver something to the business users in a very short period of time. The first something would equate to only a fraction of the requested fully functioning application, but more functionality and more data would be added rapidly with each subsequent application release. Many business users still balk at this approach and insist on the “minimum required” deliverable, saying that anything less is of no use to them. They don’t realize that not only are they not losing anything, but they are gaining a lot with application releases.&lt;br /&gt;&lt;br /&gt;Building an application in small iterations (application releases) will not take any longer than building the whole enchilada at one time. 
In addition, the business users can see their application grow and catch mistakes or adjust their requirements if needed (under strict change control procedures). This will greatly enhance the quality of the final fully functioning application. Another benefit of this approach is the opportunity for business people to slowly become familiar with their data warehouse application and any new technology or tools before the application is even completed.&lt;br /&gt;&lt;br /&gt;Maybe the best way to illustrate the effectiveness of extreme scoping with application releases is to recall the concepts of prototyping. In prototyping we focus on a small (partial and incomplete) scope that is not too complex, and we produce a not yet fully functioning application. The next prototype release includes another small portion of the overall scope with a little more complexity and a little more functionality. This process can be repeated until the application is fully functioning. Although application releases are based on the same concepts as prototyping, they are not equal to prototyping. The difference is that traditional prototyping is pure ad hoc development, whereas application releases demand all necessary project activities to be performed with the rigor of a methodology.&lt;br /&gt;&lt;br /&gt;Scoping data warehouse projects will remain a struggle as long as we are married to the idea that a project must produce a fully functioning application. But if we use application releases to build data warehouse applications, controlling the scope becomes much easier because the complexity of each release is reduced, the number of activities performed is decreased, and the project team can be smaller, rendering the methods for controlling the project more effective.&lt;br /&gt;&lt;br /&gt;Top&lt;br /&gt;&lt;br /&gt;5. 
Failing to Have an Active Project Charter&lt;br /&gt;&lt;br /&gt;Most project managers create a short document that describes their project in terms of high-level requirements, users, schedule, resources, and budget. This document is often known as a document of understanding, project agreement, scope agreement, or project charter. Frequently, this document is created by copying an old document from a previous project and changing a few details here and there. Once the project is kicked off, this document disappears into a project manual—never to be seen again.&lt;br /&gt;&lt;br /&gt;A well-thought-out project charter is a very useful instrument and should be used actively to monitor and control project activities during the entire development cycle of a data warehouse project. Therefore, the project manager and the business user, or business sponsor, should spend some time documenting the details of their agreement in this charter. While a detailed project charter can contain as many as 20 sections, the following four sections are the most useful to serve as a baseline for change control:&lt;br /&gt;&lt;br /&gt;    * Scope&lt;br /&gt;    * Risks&lt;br /&gt;    * Assumptions&lt;br /&gt;    * Constraints&lt;br /&gt;&lt;br /&gt;Traditionally, scope has been measured by the number of functions the system will perform (function point analysis)—a sure way to underestimate effort, budget, and resources. Data warehouse applications are data-intensive, not function-intensive. Therefore, scope must be measured by the number of data elements that have to be extracted from the source systems, transformed and cleansed, and loaded into the data warehouse target databases.&lt;br /&gt;&lt;br /&gt;Every project is subject to some risks—risks are unavoidable. Such risks could severely affect the project schedule as well as the project deliverables, depending on the likelihood the risks will materialize and the impact they would have on the data warehouse project. 
The project manager must identify triggers for each risk and incorporate a risk mitigation plan as well as a contingency plan into the project program.&lt;br /&gt;&lt;br /&gt;An assumption is anything taken for granted; a supposition or a presumption. It’s important to document assumptions because a wrong assumption could very quickly turn into a risk. Important assumptions should always have counterpart risks, in case the assumptions either turn out to be false or do not materialize.&lt;br /&gt;&lt;br /&gt;All projects are subject to the four constraints of scope, effort (time), budget, and resources (capable and available people). In reality, there is a fifth constraint: quality. Although quality is a measure of how well the deliverables meet the requirements, it can also be considered a constraint that must be balanced with the other constraints because higher quality requires more effort and therefore more time to deliver.&lt;br /&gt;&lt;br /&gt;The project charter should record the agreed-upon scope negotiated under the stated risks, assumptions, and constraints. If any of these components change, the entire project has to be reevaluated and renegotiated, and the changes should be reflected in the revised project charter so it can be used as a baseline for the next change request.&lt;br /&gt;&lt;br /&gt;Unfortunately, business managers and IT managers frequently put their project teams under unwarranted pressure to incorporate scope changes without ever referring back to the original agreement in the project charter, and also without performing the necessary impact analysis.&lt;br /&gt;&lt;br /&gt;Top&lt;br /&gt;&lt;br /&gt;6. Lack of a Readiness Assessment&lt;br /&gt;&lt;br /&gt;Many project managers attend data warehouse conferences and learn best practices for planning, designing, and implementing data warehouses. 
But, when they return to their organizations and try to apply the best practices, they often encounter resistance from business users and “uninitiated” IT members. Regardless of how hard the project managers try to educate, convince, or force those who oppose them to follow best practices, they are often unsuccessful.&lt;br /&gt;&lt;br /&gt;The mistake these project managers make is not realizing their organizations aren’t ready to suddenly stop building traditional stand-alone systems and begin building an integrated data warehouse environment. Some organizations are bent on trying every shortcut and silver-bullet solution before they admit those solutions only add to their data chaos.&lt;br /&gt;&lt;br /&gt;Therefore, at the beginning of the initiative, an organization’s readiness should be assessed (understanding, ability, and willingness). A readiness assessment would include the following questions:&lt;br /&gt;&lt;br /&gt;1. Have the goals and objectives been defined?&lt;br /&gt;2. Do the goals and objectives for the data warehouse map to those&lt;br /&gt;of the organization?&lt;br /&gt;3. Has the source data been inventoried and modeled?&lt;br /&gt;4. What is the quality of the source data?&lt;br /&gt;5. Are the skills in place to build and support the data warehouse?&lt;br /&gt;6. Is an adequate budget in place?&lt;br /&gt;7. Has supporting software (ETL, cleansing, DBMS, etc.) been&lt;br /&gt;selected and installed?&lt;br /&gt;8. Is there a strong, well-placed, and reasonable business sponsor?&lt;br /&gt;9. Does the business sponsor understand that a data warehouse is&lt;br /&gt;not a stand-alone system?&lt;br /&gt;10. Are the primary business users computer literate?&lt;br /&gt;11. Are the business users’ expectations realistic?&lt;br /&gt;12. Do the business users understand they have to participate in&lt;br /&gt;project activities?&lt;br /&gt;13. 
Does the business sponsor accept the approach of building applications iteratively (using the software release concept)?&lt;br /&gt;&lt;br /&gt;Based on the assessment, the project manager can determine which best practices to implement. Periodically, the questionnaire should be re-distributed to gauge the organization’s understanding of data warehousing. At that time, more best practices can be incorporated.&lt;br /&gt;&lt;br /&gt;Top&lt;br /&gt;&lt;br /&gt;7. Inadequate Testing&lt;br /&gt;&lt;br /&gt;Data warehouse testing is often done poorly. It’s unacceptable—and so is the excuse that “it can be fixed in the next release.” If it takes too long to test the data warehouse application properly now, it will take even longer to test it later because the next release will be larger and more complicated.&lt;br /&gt;&lt;br /&gt;The same types of testing activities that apply to operational systems also apply to data warehouse applications, including unit testing, integration (systems) testing, performance (stress) testing, quality assurance testing, and user acceptance testing.&lt;br /&gt;&lt;br /&gt;Unit testing refers to the testing of discrete program modules and scripts. Every developer must test his or her program modules and scripts individually.&lt;br /&gt;&lt;br /&gt;Integration testing tests the complete process. The interactions and flow of all programs must be observed and validated. Every time actual test results do not equal the expected test results, the program producing the error must be corrected, and all programs must be rerun.&lt;br /&gt;&lt;br /&gt;The most complicated and time-consuming type of testing is regression testing. 
The main goal of regression testing is to ensure any modifications to existing programs did not inadvertently produce errors.&lt;br /&gt;&lt;br /&gt;Performance testing, also known as stress testing, is performed to predict system behavior and performance. It can be limited to only the most critical program modules with the highest volumes of data and the longest runtimes.&lt;br /&gt;&lt;br /&gt;Most large organizations have strict procedures for moving applications into production. These procedures usually include QA testing, at which time the operations staff goes through a simulated production run before allowing the application to transfer to the production environment.&lt;br /&gt;&lt;br /&gt;Acceptance testing is done by business users. They validate the functionality of the data warehouse application.&lt;br /&gt;&lt;br /&gt;With the possible exceptions of unit and performance testing, all other testing activities are controlled by a test plan. The bulk of the plan will be a list of test cases. Each test case specifies the input criteria and the expected output results for each run. It also describes the program logic performed and the appearance of the resulting data.&lt;br /&gt;&lt;br /&gt;Top&lt;br /&gt;&lt;br /&gt;8. Underestimating Data Cleansing Efforts&lt;br /&gt;&lt;br /&gt;Most organizations admit they are not paying sufficient attention to their data quality. As disheartening as the situation is with operational source systems, it is very discouraging to see that many IT and business managers are continuing to put pressure on project teams to build data warehouses quicker using the motto: “There’s never enough time to do it right, but always enough time to do it over.”&lt;br /&gt;&lt;br /&gt;With project schedules shrinking and project scopes expanding, project managers are under the gun to deliver more in less time. 
Therefore, they habitually do not allocate enough time for source data analysis, business rule discovery, data cleansing, data reconciliation, and ETL testing. As a result, two things happen: (1) many data defects propagate into the data warehouse unnoticed, and (2) some dirty data is discovered too late when data exceptions are caught during ETL testing or while loading the data warehouse databases.&lt;br /&gt;&lt;br /&gt;To avoid project delays, the project manager should build sufficient time into the project plan to profile each data element. Common data violations to look for include:&lt;br /&gt;&lt;br /&gt;    * Missing data values&lt;br /&gt;    * Default values that actually have a meaning, e.g., using&lt;br /&gt;      “888-88-8888” as a social security number to indicate&lt;br /&gt;      a non-resident alien&lt;br /&gt;    * Logic embedded in a data value, such as an implied roll-up&lt;br /&gt;      structure, e.g., a 10-digit account number where the first four&lt;br /&gt;      digits are the branch number&lt;br /&gt;    * Cryptic and overused data elements, e.g., using the values&lt;br /&gt;      “A, B, C, D” to mean type of customer, while the values&lt;br /&gt;      “E, F, G, H” mean type of location&lt;br /&gt;    * Multipurpose data elements, e.g., data elements redefined using&lt;br /&gt;      the old COBOL “redefines” clause&lt;br /&gt;    * Contradicting data values among dependent data elements, e.g., “Boston, CA”&lt;br /&gt;    * Reused primary key, e.g., two different employees (one retired,&lt;br /&gt;      one active) with the same employee number&lt;br /&gt;    * No unique primary key, e.g., one customer with multiple&lt;br /&gt;      customer numbers&lt;br /&gt;    * Objects without their dependent parent object, e.g., job&lt;br /&gt;      assignments for employee 3321, but employee 3321 does not exist in the employee database&lt;br /&gt;&lt;br /&gt;Top&lt;br /&gt;&lt;br /&gt;9. 
Ignoring Metadata&lt;br /&gt;&lt;br /&gt;Metadata is nothing new—it has always been a part of automated systems. It was found in system documentation, record layouts, database catalogs, and data declaration sections in programs. In fact, it used to be called the dirty “D” word—documentation. Since technicians detested the thought of doing documentation, they often simply didn’t do it.&lt;br /&gt;&lt;br /&gt;However, in a data warehouse environment, metadata takes on a new level of importance. Since one data warehouse objective is to eliminate inconsistencies, data must be standardized. Standardization may result in renaming the data, splitting one source data element into multiple target columns, or populating one target column from multiple source data elements. It can also mean translating codes into mnemonics, standardizing (changing) data values, and filtering out inappropriate or invalid records. At the end of the day, business people will not be able to reconcile their operational source data to the data warehouse data unless they have the ability to trace these changes.&lt;br /&gt;&lt;br /&gt;Therefore, metadata is now the nice “N” word—navigation. It helps the business people locate, manage, understand, and use the data in the data warehouse databases. It describes what data is available in which database, what it means, where it came from, how it was processed, how clean it is, and how it is used in reports and queries. Not delivering any metadata that could help business people navigate through their data warehouse environment is a mistake.&lt;br /&gt;&lt;br /&gt;Granted, implementing a metadata repository has its challenges. Although many data warehouse experts consider metadata to be the “glue” holding the warehouse environment together, most organizations allocate little or no money for creating and maintaining a metadata repository. 
Also, metadata should be “living” documentation that is constantly updated, which means—at a minimum—one metadata administrator must be assigned full time to manage the metadata repository. But regardless of the challenges, metadata must be an integral part of a data warehouse environment.&lt;br /&gt;&lt;br /&gt;And, since the importance of metadata is still not understood by most business executives, project managers need to do a much better job of promoting metadata and communicating its benefits to the business people.&lt;br /&gt;&lt;br /&gt;Top&lt;br /&gt;&lt;br /&gt;10. Being a Slave to Project Management Tools&lt;br /&gt;&lt;br /&gt;Planning and controlling a project is not a trivial task. It takes a long time to create a work breakdown structure, estimate effort and duration time for all tasks, apply task dependencies and resource dependencies, and determine the critical path. Critical path refers to a string of dependent tasks that cannot be late without affecting the project schedule, as compared to other tasks executed at the same time with slack time built into them. For example, if it takes four days to evaluate a product but only three days to create a project charter (at the same time the product is being evaluated), then the task of evaluating a product is the critical path because it has no leeway in timing. If that task is a day late, the project schedule is impacted.&lt;br /&gt;&lt;br /&gt;Knowing where the critical path tasks are at any point during a project is crucial to staying on track. Since estimates are only best guesses based on prior experience with similar tasks, the actual time it takes to complete a task usually differs from its estimate. These differences can easily change the critical path. Tracking the differences between estimated and actual time, and continuously adjusting the critical path has enslaved more than one project manager to his/her project management tool. 
Other project managers find this activity too laborious and too tedious, and they stop tracking the critical path altogether.&lt;br /&gt;&lt;br /&gt;The best approach (and compromise) for tracking the critical path is to do it at the milestone level rather than at the task level because there are fewer milestones than there are tasks. Thus, it takes less time and effort to continuously adjust the project plan. The critical path among the tasks between the milestones can be tracked more informally using a whiteboard or a flipchart instead of the project management tool. This approach can be safely and effectively used when the scope of the data warehouse project is very small and the project team is managed by a self-organizing core team (SWAT team) rather than by&lt;br /&gt;a project manager who is not involved in daily project activities.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2387670469462939744-5253509382207162859?l=rht-bi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://rht-bi.blogspot.com/feeds/5253509382207162859/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2387670469462939744&amp;postID=5253509382207162859' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2387670469462939744/posts/default/5253509382207162859'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2387670469462939744/posts/default/5253509382207162859'/><link rel='alternate' type='text/html' href='http://rht-bi.blogspot.com/2007/04/mistakes-made-by-data-warehouse-project.html' title='Mistakes made by data warehouse project managers'/><author><name>Tidhi</name><uri>http://www.blogger.com/profile/11392471881762637742</uri><email>noreply@blogger.com</email><gd:image 
rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_o3SPD6mM7vA/SRgsPVAw8JI/AAAAAAAAAq8/2MjjJxF3JQg/S220/balaram.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2387670469462939744.post-6689492516314986410</id><published>2007-04-20T13:03:00.000-07:00</published><updated>2007-04-20T13:04:19.573-07:00</updated><title type='text'>Disaster Recovery Planning for your Data Warehouse</title><content type='html'>Summary&lt;br /&gt;&lt;br /&gt;Data warehouse systems are increasingly becoming part of the mission-critical infrastructure for an enterprise [1]. As data warehousing evolves from a back-room reporting solution to a front-line source of business intelligence, the service levels for availability of decision support capability inevitably become more aggressive. Information must be available when decisions are required in the enterprise. And, increasingly, decisions are required on a 24x7 basis. The extreme in high-availability implementation is to protect the delivery of business intelligence capability within the enterprise from down time caused by natural and human disasters that may take out a full data center. As data warehousing becomes more important to the enterprise at the same time as solutions for redundant systems implementation become more cost-effective, disaster recovery implementation for business intelligence is emerging as a best practice in the industry. In this article, we outline a framework for realization of a disaster recovery plan that can be used across a broad range of availability service levels.&lt;br /&gt;&lt;br /&gt; &lt;br /&gt;Service Level Definitions&lt;br /&gt;&lt;br /&gt;The prerequisite for designing a disaster recovery architecture is to clearly define the service level requirements for the solution. The economics of the solution can vary greatly depending on how aggressively the service levels are defined for handling a disaster recovery scenario. 
There are four basic questions that need to be answered in order to provide a framework for the disaster recovery solution:&lt;br /&gt;&lt;br /&gt;   1. What amount of time is allowed between when a disaster strikes and when the disaster recovery solution must be capable of taking over the data warehouse workload?&lt;br /&gt;   2. How updated must the data be when operations resume on the disaster recovery solution?&lt;br /&gt;   3. What data is required to be available for data warehouse operations in the disaster recovery solution?&lt;br /&gt;   4. What performance capacity is required for data warehouse operations in the disaster recovery solution?&lt;br /&gt;&lt;br /&gt;There is a wide range of requirements that may be defined in answer to these four simple questions. These answers will dictate the possible solution space for the disaster recovery implementation.&lt;br /&gt;&lt;br /&gt;Imagine a relatively non-aggressive requirement for recovery within 24 hours of a disaster, with data updated at closing the previous night. This scenario allows for a third-party disaster recovery solution. Every night, back-up tapes are shipped to an off-site facility managed by a third-party vendor. When there is a disaster, the backup tapes are used to load up the data into the warehouse and operations are resumed. Of course, disaster recovery capability is not just about re-loading the data warehouse. A full solution also requires network connectivity to the remote site for end users and a means of continued data acquisition into the data warehouse.&lt;br /&gt;&lt;br /&gt;Normally, the third-party vendor will provide recovery services to multiple customers. This effectively lowers the total cost of the solution by "sharing" the equipment (hardware platform, storage devices, tape drives, etc.) and processes required for disaster recovery across many enterprises. 
Appropriate care needs to be taken so that the disaster recovery solution is not shared among multiple customer sites located in the same city where an earthquake, hurricane, or terrorist event could cause simultaneous demand for the same equipment.&lt;br /&gt;&lt;br /&gt;At the other extreme in terms of a service level scenario, a requirement for instantaneous recovery with completely up-to-date data after a disaster will demand that a "hot" disaster recovery system be available at a remote site for immediate takeover of the data warehouse workload. This disaster recovery system will need to be dedicated, rather than shared, so that its data can be kept completely updated and immediately available for handling recovery of the data warehouse workload. This more aggressive service level requirement will inevitably lead to a higher cost which must be balanced against the economic value of an immediate recovery capability for the data warehouse. Availability of a "hot" disaster recovery system is also a convenient vehicle for handling planned down time on one or the other system in the configuration without visible outages to the end-user community.&lt;br /&gt;&lt;br /&gt;Rather than blindly applying the disaster recovery solution to the complete data warehouse, it is often the case that significant cost savings can be achieved by architecting a disaster recovery capability for only a subset of the data warehouse workload. An inventory of data warehouse applications should be undertaken in which workloads are classified into "critical for recovery" versus "non-critical" for the purposes of inclusion in the disaster recovery solution. Typically, not all data and not all performance capacity is required in a disaster recovery scenario. 
Our experience is that by rigorously classifying workload into "critical" and "non-critical," the solution configuration for disaster recovery is usually less than 50 percent of the capacity (in storage and processing capability) as compared to the steady state production configuration.&lt;br /&gt;&lt;br /&gt;Under the extreme circumstances of a disaster situation, an enterprise is normally willing to give up some less-critical reports and analyses. After a disaster, running at less than full performance capacity with less than full detail and/or history in the data warehouse will be acceptable in the short term as long as there is a promise to restore full operations in some reasonable period of time. The resumption time of full-scale production operations must also be part of the service level definition for the solution when the disaster recovery system is less capable than the system used under normal operations.&lt;br /&gt;&lt;br /&gt; &lt;br /&gt;Data Synchronization&lt;br /&gt;&lt;br /&gt;When service levels allow several hours to achieve disaster recovery, it may be feasible to restore operations from the backup tapes that were regularly shipped to the off-site recovery center. However, more often than not, the emerging service level requirement for disaster recovery is nearly immediate. Even if it isn't a requirement from day one, it will usually be a requirement in the near future. Under such conditions, recovery from back-up tapes will not be acceptable. A strategy for keeping the data updated at the off-site system is required for more immediate recovery capability. This requires data synchronization between the primary and secondary data warehouse systems. Data replication is a common means to achieve synchronization between the primary and secondary systems in a disaster recovery architecture. Replication can be performed using a variety of methods. 
Implementation at the RDBMS level involves capture of all database insert, update, delete, table/view creation, etc. operations and replicating to a second instance of the physical database. Replication at the database level can be implemented using peer-to-peer or master-slave configurations and can be implemented either synchronously or asynchronously, depending on the business requirements. Implementation at the device driver level involves replication of logical I/Os at the disaster recovery site. Replication at the device driver level is generally implemented in a master-slave configuration. Implementation at the disk controller level involves replication of physical I/Os at the disaster recovery site and is also implemented only with master-slave configurations.&lt;br /&gt;&lt;br /&gt;The choice of replication strategy involves trade-offs between complexity, performance, and cost. Depending on the data warehouse platform, different solutions will be more or less desirable along the lines of these trade-offs. Some platforms will have a built-in solution and others will require a combination of systems integration and third-party tools. One aspect of implementation solutions to be acutely aware of is whether there is a requirement to take the primary system off-line for the purpose of updates to achieve replication. Our experience is that making the system unavailable, even if only for update activity, to execute replication services is unacceptable as a disaster recovery solution. Make sure that your solution of choice will not cause more downtime than it is meant to prevent!&lt;br /&gt;&lt;br /&gt;While replication is quite useful for capturing low-volume changes from the primary system and propagating them to the disaster recovery site, it is not well suited to high-volume data loading. When replication is used for high-volume data acquisition, it ends up putting undue capacity stress on the primary system. 
Not only does the primary system need to apply database updates to its copy of the data, it is also required to propagate the I/O (whether at the database, device driver, or disk controller level) to the disaster recovery system.&lt;br /&gt;&lt;br /&gt;For high-volume updates, it is more efficient to use a dual load strategy for data synchronization. In this way, both systems are independently updated from a capacity management perspective. Typically, the updates are performed asynchronously across the primary and secondary system. Usually, a small window of difference between the two "copies" of the data warehouse is permissible from a business perspective, as long as eventual convergence with data integrity between the two systems is guaranteed upon quiescence of the update stream. Two-phase commit with synchronous data loading into the two copies of a data warehouse is not practical for delivering high performance when large volume data acquisition is involved. Checkpoint, re-start capability is essential to ensure that the two systems can be re-synchronized if one system fails. The best strategy to facilitate a dual load implementation is to take the output of the ETL (extract, transform, and load) transformation phase and feed it into parallel load operators provided by either the ETL tool or RDBMS vendor.&lt;br /&gt;&lt;br /&gt; &lt;br /&gt;Dual Live Implementation&lt;br /&gt;&lt;br /&gt;Recent implementations of disaster recovery have taken a different architectural approach than in the past. Old style implementations typically implemented the disaster recovery solution as a hot standby system with switch-over capability if the primary system were to be incapacitated. 
This style of implementation means that the hot standby system is not being used "productively" except when a disaster occurs.&lt;br /&gt;&lt;br /&gt;In more progressive implementations, both the primary and secondary systems in a disaster recovery configuration would be "live" and "productive" for executing data warehousing workloads. In situations where service levels for recovery require both the primary and secondary systems to be up-to-date in data content, it makes sense to put the secondary system to work by load balancing across the two systems. Greater throughput on the data warehousing workload is achieved when both systems are working together rather than having the secondary system sitting by in idle mode waiting for a disaster to occur.&lt;br /&gt;&lt;br /&gt;Middleware can be used to transparently load balance queries across the primary and secondary systems. The middleware will need to be aware of relative performance capacity on the two systems, as well as data content present on each system when the two environments are not complete mirrors of each other, to ensure optimal assignment of queries across the two systems. In addition, some good engineering must be put into place to ensure that load balancing is done efficiently in the context of minimizing replication of temporary tables across the two systems and defining the unit of recoverable work in an appropriate way so as to balance recovery time and efficiency.&lt;br /&gt;&lt;br /&gt;Of course, in the case of a disaster at one of the two data centers, all of the workload will need to fail over to the surviving system. One danger when sharing workload across two systems is that end users begin to take the additional performance for granted. 
Careful capacity planning must be undertaken to ensure that if one of the two systems goes down there will still be enough performance capacity in the surviving system to execute the critical workload for the enterprise.&lt;br /&gt; &lt;br /&gt;Conclusions&lt;br /&gt;&lt;br /&gt;While disaster recovery capability for a data warehouse was typically far from our thoughts a few years ago, it is becoming reality as an emerging requirement in today's business environment. There are many implementation scenarios possible, and the choice for an organization must be driven by the economic trade offs (costs and benefits) associated with different levels of availability for the business intelligence environment. Sophisticated implementations of a disaster recovery capability will use the solution for "hiding" planned down time and increasing throughput in the data warehouse environment by putting the disaster recovery system to work even when a disaster has not occurred.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2387670469462939744-6689492516314986410?l=rht-bi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://rht-bi.blogspot.com/feeds/6689492516314986410/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2387670469462939744&amp;postID=6689492516314986410' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2387670469462939744/posts/default/6689492516314986410'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2387670469462939744/posts/default/6689492516314986410'/><link rel='alternate' type='text/html' href='http://rht-bi.blogspot.com/2007/04/disaster-recovery-planning-for-your.html' title='Disaster Recovery Planning for your Data 
Warehouse'/><author><name>Tidhi</name><uri>http://www.blogger.com/profile/11392471881762637742</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_o3SPD6mM7vA/SRgsPVAw8JI/AAAAAAAAAq8/2MjjJxF3JQg/S220/balaram.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2387670469462939744.post-1269646469360189552</id><published>2007-02-06T12:54:00.000-08:00</published><updated>2007-04-20T13:08:16.449-07:00</updated><title type='text'>Best Practices for Data Warehouse Database Developers</title><content type='html'>Best Practices for Data Warehouse Database Developers&lt;br /&gt;&lt;br /&gt;By nature, data warehouse projects are costly endeavors where many resources are consumed - both hardware and software. Although experienced database developers may use 90 percent of the same skills they used on their last projects, they may find themselves approaching development issues differently!&lt;br /&gt;&lt;br /&gt;While the project team as a whole usually understands the risks and benefits of building a data warehouse, I find that not enough IT managers are initially aware of the unique challenges a data warehouse effort poses for database developers. Operating in a very large database (VLDB) environment, a multigigabyte to terabyte decision support system (DSS), will often require unique approaches from database developers, where strategic value can be added to the warehouse development cycle. Understanding and anticipating the kinds of challenges developers will face in a DSS database is essential. This article will address a few of these challenges.&lt;br /&gt;&lt;br /&gt;1. Make sure you are provided with a usable data dictionary before starting heavy-duty development. 
Many data warehouse projects suffer from time constraints, so it is not uncommon for some area of construction on the system database and corresponding development tasks to commence while other tasks in the analysis domain - business user interviews, requirements gathering, source to target analysis, etc. - are still being conducted in parallel. On these types of data warehouse initiatives, the developers seem to be perpetually playing detective - iteratively asking questions about data mapping, validation ranges, aggregation and related semantics during their coding of procedures, triggers, queries, application programming interfaces (APIs), ETL (extract, transform and load) scripts, and so on. While there may certainly be crossover between the gathering of systems requirements for a data warehouse and the construction of a data dictionary, some sort of data dictionary should be in place before any critical coding or database development takes place. As this lexicon of corporate data meanings and semantics grows, the corporate data steward should see to it that things such as rules, validations and domain ranges are added, giving rise to a true enterprise dictionary. The data dictionary should be stored on the corporate intranet and available to both business users and developers alike.&lt;br /&gt;&lt;br /&gt;Warehouses that are built without a useful data dictionary will often result in physical functional areas sharing common data elements, duplication of coding effort, increased redundant data and confusion and communication problems for the developers. The data dictionary should be stored in a meta data repository database, and a concerted effort should be made to merge and tie in the information with your ETL tool's meta data (for example, source and target mappings). Developers will be glad they have one place to find mappings, data meanings, validations, domains, aggregation rules, etc. 
Without an industry standard on ETL meta data, this may be easier said than done; nevertheless, the days of keeping the data dictionary solely in a spreadsheet on a file server should be over!&lt;br /&gt;&lt;br /&gt;2. Save query plans, run times and performance benchmarks in the database. Storing processing performance information and benchmarking data in the database can be done quite easily, although it is often an afterthought in many data warehouses. For example, recording a process start time and end time for every critical batch or processing task in the warehouse can easily be implemented via such things as stored procedures, shell scripts or ETL tool tasks that serve as wrapper or control objects. These process control components become responsible for recording execution and completion statistics as they execute the critical processes in the data warehouse. Why keep benchmarks? Saving benchmarking data in the database helps pinpoint performance problems by establishing baselines of mean/median run times. This helps the team focus on tuning opportunities, gives direction on things such as hardware load balancing, troubleshooting and SLA expectations, and facilitates better practices in the maintenance of your system.&lt;br /&gt;&lt;br /&gt;Keeping process benchmarks as part of your meta data is a logical extension of a robust meta data repository, providing information about your warehouse processes - job sequence, parameters, run-times - in one physical place. Remember that meta data should not just be data about your business-oriented data, source target mappings, etc.; it is also data about your warehouse processes. You could extend this approach to track user activity, identifying bottlenecks and most-used queries by capturing statistics on query start and end times, number of reads on the database, number of rows returned per query and more.&lt;br /&gt;&lt;br /&gt;3. 
Save ETL, validation and processing errors in shared database tables. Similar to the previous approach is the practice of properly trapping all data warehouse processing errors in database tables. Nobody should have to wade through error logs and error tables marooned in multiple environments. All errors should be trapped, consolidated and sent to one place - your meta data repository. This means that any errors that occur in the domain of the ETL tool are logged with any errors encountered in the post-ETL tool load process, whether it be from things such as loading the operational data store (ODS) or building the online analytical processing (OLAP) cube. It is important to establish error thresholds for each process in the data warehouse as well as what actions to take when those error thresholds are encountered. This is usually one area where requirements gathering falls short; nevertheless, veteran data warehouse developers will want answers about this information fairly early in the development process. E-mail notification of any errors that exceed predetermined thresholds should be the goal of any robust data warehouse.&lt;br /&gt;&lt;br /&gt;4. Avoid long-running transactions. In your online transaction processing (OLTP) applications, you did not have to worry so much about long-running transactions. However, now those data manipulation language (DML) operations on millions of rows may fill up the database's transaction log, bringing your development or batch processing to a standstill. If you are writing stored procedures, keep them modular with respect to each unit of work, and break your transactions into more granular operations. This will also give you more control over error handling, as you will have less to roll back when an error condition strikes, and you can isolate your errors more easily. Also, remember that you are dealing with millions of rows. 
All those long-running transactions may hold locks on precious data, slowing a parallel load of your database to a crawl.&lt;br /&gt;&lt;br /&gt;5. Use referential integrity carefully. Beware of the pitfalls of using all of the referential integrity (RI) bells and whistles of your relational database management system (RDBMS); always know the performance tradeoffs with RI. While foreign key constraints help data integrity, they have an associated cost on all insert, update and delete statements. Give careful attention to the use of constraints in your warehouse or ODS when you wish to ensure data integrity and validation. Also consider the advantages of implementing certain types of validations and check constraints in your ETL tool or data staging area. While triggers are a godsend in OLTP, they may slow mass inserts into your VLDB considerably, as every row inserted will fire its corresponding trigger once.&lt;br /&gt;&lt;br /&gt;6. Learn to recognize when the law of diminishing returns is in effect. Sometimes "good enough" performance is acceptable. Avoid the urge to perform endless incremental improvements in the optimization of your database code. Many times as a matter of pride or competition, developers try to keep tuning structured query language (SQL) or other code when, in fact, the run times of the current batch processes fit comfortably into existing batch windows. Although this may be the simplest concept in the article, it remains very difficult for many developers to grasp. Information technology exists to support the business and its processes in a constrained time arena; know the service level agreements you have with your business users and exactly what types of improvements will help you meet or keep your acceptable levels of service.&lt;br /&gt;&lt;br /&gt;7. Always understand your database's optimizer and query plans. 
Everybody knows that random-access memory (RAM) access/logical reads are always cheaper than physical disk access, yet I am always amazed at the lack of understanding and attention given to such things as query plans and I/O statistics analysis. All developers writing SQL operations against a VLDB should know how to create and decipher a database's query plan and be able to tune all data manipulation statements for best possible performance. When I encounter a data warehouse schema for the first time and I want to issue a SQL statement, I always try to find out as much as I can about the nature (business meanings, storage, indexes, etc.) of the data. Before I execute any queries against the data warehouse, I first compile them and then run them (non-exec mode) with the query plan in effect. Only when I am comfortable that I am covering indexes, issuing the correct joins and getting good I/O statistics, will I execute the query. If I am just trying to get "acquainted" with the data, I will limit my result sets so that only enough rows are returned as to provide me with some clues about the nature of what the data means, in the real world empirical sense. This approach has saved me many trips to the DBA on duty to ask him or her to kindly kill my runaway processes or Cartesian product of the day.&lt;br /&gt;&lt;br /&gt;Be aware that some of those DML operations in your repertoire that may have been fine on an OLTP order-entry system may not work in a huge, historically archived database. For instance, if you are now inserting 6 million rows en masse from an ETL tool, you should be aware of the repercussions that clustered indexes may have on your operation - the possibility that your load methodology will require the database optimizer to reorder/split some of your physical data on each insert. 
Even worse, updating field values that participate in a clustered index may take forever, as each updated row must be physically moved so that its location conforms to the order specified by the index.&lt;br /&gt;&lt;br /&gt;8. Know the limitations of your ETL tool. Before you begin serious development with your ETL tool of choice, be aware of all of its limitations and how to work around them. To give an example, many ETL tools require advanced coding practices to go from long flat file structures to various types of normalized RDBMS table structures. Therefore, you may have to output DML from your ETL tool into a SQL-esque log file, parse the log file and then use the parsed file to perform inserts into your warehouse database.&lt;br /&gt;&lt;br /&gt;Also keep in mind that many ETL tools - robust as they are - do not have a meta data repository that integrates easily with your enterprise repository, making it hard to change tools in midstream. Never underestimate the integration challenges that may arise when tackling your meta data requirements.&lt;br /&gt;&lt;br /&gt;9. Be involved in planning physical environments for testing, QA and migration. Fundamentally speaking, version control and change management practices for a data warehouse are virtually identical to those for a normal non-DSS environment. Developer access to the production database should be restricted, as database code, scripts and objects should be checked out from a repository - not just grabbed from production. A much more daunting task is deciding how to re-create the physical data warehouse environment so that developers get a true test and quality assurance (QA) environment separate from production. Given the huge volume of data that a warehouse contains, as well as all the sundry applications and pieces that make up its architecture, this may prove too costly to do, resulting in shortcuts or sharing architectural components between QA, test and production environments. 
In this case, even more thought should be given to where exactly the developer will be able to develop, perform QA and migrate new code or bug fixes. It is not uncommon for a warehouse project to be very far along before serious thought is given to migration processes and environments in which developers will conduct the maintenance and test of code because the focus tends to be on the production environment. The opposite approach of "cutting over" from development to production can be just as bad, not to mention risky. A savvy developer will start raising questions concerning the need for multiple physical environments early in the project. After all, he or she will be working every day with the physical setting provided.&lt;br /&gt;&lt;br /&gt;Bear in mind that I have only scratched the surface of best practices for data warehouse developers. IT managers, project leaders and developers who are involved with their companies' or clients' warehousing efforts should become acquainted with these issues and sundry related considerations. 
While every subtopic listed could warrant its own in-depth article, an understanding of these topics will go a long way to ensure success for database developers in a data warehouse environment.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2387670469462939744-1269646469360189552?l=rht-bi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://rht-bi.blogspot.com/feeds/1269646469360189552/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2387670469462939744&amp;postID=1269646469360189552' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2387670469462939744/posts/default/1269646469360189552'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2387670469462939744/posts/default/1269646469360189552'/><link rel='alternate' type='text/html' href='http://rht-bi.blogspot.com/2007/02/best-practices-for-data-warehouse.html' title='Best Practices for Data Warehouse Database Developers'/><author><name>Tidhi</name><uri>http://www.blogger.com/profile/11392471881762637742</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_o3SPD6mM7vA/SRgsPVAw8JI/AAAAAAAAAq8/2MjjJxF3JQg/S220/balaram.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2387670469462939744.post-339980569642802295</id><published>2007-02-06T12:30:00.000-08:00</published><updated>2007-02-06T12:31:37.387-08:00</updated><title type='text'>Design Methodologies of Kimball and Inmon...Plus a Third Way</title><content type='html'>By Hari Mailvaganam 
&lt;p&gt;Data warehousing is more an art form than a cookie-cutter science. The business variables and technical risks are unique to each installation. The business users have different goals and expectations. Data warehousing is more often successful than not if there is a reservoir of data warehousing expertise in-house.&lt;/p&gt; &lt;p&gt;This article will focus on the data warehousing design methodologies most commonly proposed. These designs are in constant flux as business needs and technical costs change.&lt;/p&gt; &lt;p&gt;Quite often the design chosen will be a combination of the methodologies below and additional requirements&amp;nbsp; - the data warehouse design &lt;strong&gt;third way&lt;/strong&gt;. I am a proponent of the third way data warehousing design. The third way takes into account the business specifics and needs of the installing company and the technical resources available. It uses the best design patterns of both methodologies plus additional requirements unique to the business.&lt;/p&gt; &lt;p&gt;The two major design methodologies of data warehousing are from Ralph Kimball and Bill Inmon. 
The design methodologies developed by Kimball and Inmon have lines drawn in the sand.&amp;nbsp;&lt;/p&gt; &lt;p&gt;Both Kimball and Inmon view data warehousing as separate from OLTP and Legacy applications.&lt;/p&gt; &lt;p&gt;Kimball views data warehousing as a constituency of data marts. Data marts are focused on delivering business objectives for departments in the organization. And the data warehouse is a conformed dimension of the data marts. Hence a unified view of the enterprise can be obtained from the dimension modeling on a local departmental level.&lt;/p&gt; &lt;p align="center"&gt;&lt;img alt=""  src="cid:part1.05090503.05000705@redhat.com" border="0" height="282"  width="348"&gt;&lt;/p&gt; &lt;p align="center"&gt;Figure 1. Kimball's Data Warehousing Design Methodology&lt;/p&gt; &lt;p align="left"&gt;Inmon believes in creating a data warehouse on a subject-by-subject area basis. Hence the development of the data warehouse can start with data from the online store. Other subject areas can be added to the data warehouse as their needs arise. Point-of-sale (POS) data can be added later if management decides it is necessary.&lt;/p&gt; &lt;p align="left"&gt;The data mart is the creation of a data warehouse's subject area.&lt;/p&gt; &lt;p align="center"&gt;&lt;img alt=""  src="cid:part2.04090403.03000503@redhat.com" border="0" height="253"  width="371"&gt;&lt;/p&gt; &lt;p align="center"&gt;Figure 2. &amp;nbsp;Inmon's Data Warehouse Design Methodology&lt;/p&gt; &lt;p align="left"&gt;There are pros and cons to both approaches. And there are third ways that can be unique to an enterprise's needs. 
Please &lt;a  href="http://www.dwreview.com/Contact.html" target="_blank"&gt;contact us &lt;/a&gt; if you would like to have more information.&lt;/p&gt; &lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2387670469462939744-339980569642802295?l=rht-bi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://rht-bi.blogspot.com/feeds/339980569642802295/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2387670469462939744&amp;postID=339980569642802295' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2387670469462939744/posts/default/339980569642802295'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2387670469462939744/posts/default/339980569642802295'/><link rel='alternate' type='text/html' href='http://rht-bi.blogspot.com/2007/02/design-methodologies-of-kimball-and.html' title='Design Methodologies of Kimball and Inmon...Plus a Third Way'/><author><name>Tidhi</name><uri>http://www.blogger.com/profile/11392471881762637742</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_o3SPD6mM7vA/SRgsPVAw8JI/AAAAAAAAAq8/2MjjJxF3JQg/S220/balaram.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-2387670469462939744.post-4393598520767031944</id><published>2007-01-11T07:44:00.000-08:00</published><updated>2007-01-11T07:48:44.170-08:00</updated><title type='text'>Business Intelligence</title><content type='html'>Jan-11:&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/2387670469462939744-4393598520767031944?l=rht-bi.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' 
href='http://rht-bi.blogspot.com/feeds/4393598520767031944/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=2387670469462939744&amp;postID=4393598520767031944' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/2387670469462939744/posts/default/4393598520767031944'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/2387670469462939744/posts/default/4393598520767031944'/><link rel='alternate' type='text/html' href='http://rht-bi.blogspot.com/2007/01/business-intelligence.html' title='Business Intelligence'/><author><name>Tidhi</name><uri>http://www.blogger.com/profile/11392471881762637742</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://4.bp.blogspot.com/_o3SPD6mM7vA/SRgsPVAw8JI/AAAAAAAAAq8/2MjjJxF3JQg/S220/balaram.jpg'/></author><thr:total>0</thr:total></entry></feed>
