SQL Profiler Terminology
To use SQL Profiler, you need to understand the terminology that describes the way the tool functions. For example, you create a template that defines the data you want to collect. You collect this data by running a trace on the events defined in the template. While the trace is running, the event classes and data columns that describe the event data are displayed in SQL Profiler.
Template
A template defines the criteria for each event you want to monitor with SQL Profiler. For example, you can create a template that specifies which events, data columns, and filters to use. You can then save the template and launch a trace with those settings; the trace data captured is based upon the options specified in the template. A template is not executed; instead, it is saved to a file with the .tdf extension.
Trace
A trace captures data based upon the selected events, data columns, and filters. For example, you can create a template to monitor exception errors. To do this, you would select the Exception event class and the Error, State, and Severity data columns, which must be collected for the trace results to provide meaningful data. After you save the template, you can run it as a trace and collect data on any Exception events that occur in the server. This trace data can be saved and then replayed at a later date, or used immediately for analysis.
Filter
When you create a trace or template, you can define criteria to filter the data collected by the event. If traces are becoming too large, you can filter them based on the information you want, so that only a subset of the event data is collected. If a filter is not set, all events of the selected event classes are returned in the trace output. For example, you can limit the Microsoft® Windows® 2000 user names in the trace to specific users, reducing the output data to only those users in whom you are interested.
Event Category
An event category defines the way events are grouped. For example, all lock event classes are grouped within the Locks event category. Event categories exist only within SQL Profiler; they do not reflect the way events are grouped within the engine.
Event
An event is an action generated within the Microsoft SQL Server™ engine. For example:
The login connections, failures, and disconnections.
The Transact-SQL SELECT, INSERT, UPDATE, and DELETE statements.
The remote procedure call (RPC) batch status.
The start or end of a stored procedure.
The start or end of statements within stored procedures.
The start or end of an SQL batch.
An error written to the SQL Server error log.
A lock acquired or released on a database object.
An opened cursor.
Security permissions checks.
All of the data that is generated as a result of an event is displayed in the trace in a single row. This row contains columns of data called event classes that describe the event in detail.
Event Class
An event class is the column that describes the event that was produced by the server. The event class determines the type of data collected, and not all data columns are applicable to all event classes. Examples of event classes include:
SQL:BatchCompleted, which indicates the completion of an SQL batch.
The name of the computer on which the client is running.
The ID of the object affected by the event, such as a table name.
The SQL Server name of the user issuing the statement.
The text of the Transact-SQL statement or stored procedure being executed.
The time the event started and ended.
Data Column
The data columns describe the data collected for each of the event classes captured in the trace. Because the event class determines the type of data collected, not all data columns are applicable to all event classes. For example, the Binary Data data column, when captured for the Lock:Acquired event class, contains the value of the locked page ID or row but has no value for the Integer Data event class. Default data columns are populated automatically for all event classes.
Monitoring with SQL Profiler
SQL Profiler is a graphical tool that allows system administrators to monitor events in an instance of Microsoft® SQL Server™. You can capture and save data about each event to a file or SQL Server table to analyze later. For example, you can monitor a production environment to see which stored procedures are hampering performance by executing too slowly.
Use SQL Profiler to monitor only the events in which you are interested. If traces are becoming too large, you can filter them based on the information you want, so that only a subset of the event data is collected. Monitoring too many events adds overhead to the server and the monitoring process and can cause the trace file or trace table to grow very large, especially when the monitoring process takes place over a long period of time.
After you have traced events, SQL Profiler allows captured event data to be replayed against an instance of SQL Server, thereby effectively reexecuting the saved events as they occurred originally.
Use SQL Profiler to:
Monitor the performance of an instance of SQL Server.
Debug Transact-SQL statements and stored procedures.
Identify slow-executing queries.
Test SQL statements and stored procedures in the development phase of a project by single-stepping through statements to confirm that the code works as expected.
Troubleshoot problems in SQL Server by capturing events on a production system and replaying them on a test system. This is useful for testing or debugging purposes and allows users to continue using the production system without interference.
Audit and review activity that occurred on an instance of SQL Server. This allows a security administrator to review any of the auditing events, including the success and failure of a login attempt and the success and failure of permissions used in accessing statements and objects.
SQL Profiler provides a graphical user interface to a set of stored procedures that can be used to monitor an instance of SQL Server. For example, it is possible to create your own application that uses SQL Profiler stored procedures to monitor SQL Server.
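Because the trace facility is exposed through these system stored procedures (sp_trace_create, sp_trace_setevent, sp_trace_setfilter, and sp_trace_setstatus), a custom application can define and start a trace directly in Transact-SQL. The sketch below is illustrative only, not a prescribed template: the output file path, the login name used in the filter, and the choice of event and data columns are assumptions, and the event and column ID numbers shown should be verified against the sp_trace_setevent topic in SQL Server Books Online.

-- Minimal sketch of a server-side trace that captures completed Transact-SQL batches.
DECLARE @TraceID int, @maxfilesize bigint, @rc int
SET @maxfilesize = 50     -- maximum size of each trace file, in megabytes

-- Create the trace definition; option 2 = TRACE_FILE_ROLLOVER. SQL Server appends .trc to the file name.
EXEC @rc = sp_trace_create @TraceID OUTPUT, 2, N'C:\Traces\batch_trace', @maxfilesize, NULL
IF @rc <> 0 RAISERROR ('sp_trace_create failed with return code %d', 16, 1, @rc)

-- Add the SQL:BatchCompleted event class (event ID 12) with the TextData (1), LoginName (11),
-- Duration (13), and StartTime (14) data columns.
DECLARE @on bit
SET @on = 1
EXEC sp_trace_setevent @TraceID, 12, 1, @on
EXEC sp_trace_setevent @TraceID, 12, 11, @on
EXEC sp_trace_setevent @TraceID, 12, 13, @on
EXEC sp_trace_setevent @TraceID, 12, 14, @on

-- Filter the LoginName column (logical operator 0 = AND, comparison operator 0 = equal)
-- so that only activity from one login is collected; the login name is a placeholder.
EXEC sp_trace_setfilter @TraceID, 11, 0, 0, N'AppLogin'

-- Start the trace (status 1 = start, 0 = stop, 2 = close and delete the definition).
EXEC sp_trace_setstatus @TraceID, 1

The trace then runs on the server until it is stopped, and the resulting .trc file can be opened in SQL Profiler for analysis or replay.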
You must have at least 10 megabytes (MB) of free space to run SQL Profiler. If free space drops below 10 MB while you are using SQL Profiler, all SQL Profiler functions will stop.
Starting SQL Profiler
SQL Profiler is started from the Microsoft® Windows NT® 4.0, Microsoft Windows® 2000, or Microsoft Windows 98 Start menu, or from SQL Server Enterprise Manager.
Archiving and Restoring Databases
An archive file for a Microsoft® SQL Server™ 2000 Analysis Services database contains the contents of the directory named the same as the database. For example, the archive file for the Yourdata sample database stores the contents of the Yourdata directory. The default path of this directory is:
C:\Program Files\Microsoft Analysis Services\Data\yourdata
All Analysis Services databases can be found in the Data directory. The path leading to the Data directory can be changed. To determine the current path, in Analysis Manager, right-click the server that contains the database, click Properties, and then see the Data folder box.
Important: The subdirectories of the Data directory store security files that control end users' access to objects on the Analysis server. These files are included in the archive files. For this reason, archive files must be secured against unauthorized access.
The archive file also stores meta data for the database and its objects. The appropriate records from the Analysis Services repository are included in the archive file. By default the Analysis Services repository is:
C:\Program Files\Microsoft Analysis Services\Bin\msmdrep.mdb
However, the path of the repository can be changed at installation, and the repository can be migrated to a SQL Server database.
Caution: When you restore a database, its file set (in the Data directory and its subdirectories) and its meta data are returned to their states at the time the archive file was created. Files that were created since then in these directories are deleted. Changes and additions since then to Analysis Services repository records for the database and its objects are removed.
When you restore a database:
• The files in the selected archive file are restored to the appropriate directories in the Data directory. If these directories already contain files with the same names as files in the archive file, the existing files are overwritten by the files in the archive file. If these directories contain files with different names, including new files that were created since the archive file was created, these files are deleted.
• The records in the Analysis Services repository that are associated with the database and its objects are replaced by the Analysis Services repository records in the selected archive file.
Sometimes restoration of a database containing relational OLAP (ROLAP) partitions copies rather than replaces the original, archived partitions. When this happens, the original ROLAP partitions and their copies will use the same aggregation tables. This situation may cause overwrite conflicts in the tables. For example, if you change and process an original partition, the aggregation tables may no longer be valid for the copy of that partition. To avoid this problem, specify a unique aggregation prefix for each ROLAP partition immediately after restoration, and then process the partitions whose aggregation prefixes you changed. This action creates different aggregation tables for each partition.
Maintaining OLAP Data
The purpose of Microsoft® SQL Server™ 2000 Analysis Services is to provide rapid analytical access to data warehouse data. To accomplish this purpose, Analysis Services creates multidimensional cubes from data in the data warehouse fact and dimension tables. Numerical measures are also summarized into preaggregated values during cube construction. Cubes are stored in multidimensional structures that are designed for rapid query response, combining preaggregated information with raw fact data to respond to a wide variety of queries.
Cubes can contain data summarized, copied, or read directly from the data warehouse. Changes to the structure of the data warehouse or the data contained in it can affect the integrity and accuracy of cubes that have been created from the data warehouse. Because Analysis Services provides continuous online access to cubes, changes to the underlying data warehouse must be approached with a clear understanding of their effects on cubes and how to manage the synchronization of data in the data warehouse with data in cubes.
OLAP data must be updated after data warehouse data is changed. You process OLAP cubes, dimensions, and partitions to incorporate new or changed data from the data warehouse. The method of processing an OLAP object depends on the object and type of change made to the data warehouse, such as data addition, data change, or structural change.
Real-time OLAP is a feature that uses real-time cubes to automatically synchronize cube data with changes in the underlying relational database. Real-time cubes can be used for applications that need to monitor and analyze live data, and are intended to extend OLAP capabilities rather than replace traditional cube designs and applications.
Changes in the Data Warehouse
Data is usually added periodically to the data warehouse to include more recent information about the organization's business activities. Changes to data already in the data warehouse are less frequent and usually made only to incorporate corrections to errors discovered in the source from which the data was extracted, or to restructure data due to organizational changes. Structural changes to the data warehouse design typically are the least common.
Data Additions
It is common to add new data to the data warehouse. Cube information available online to client applications can be affected when data is added to the data warehouse due to interaction between the data and cube partitions. You can manage the effects of adding data to the data warehouse by carefully defining partition filters, and by designing a strategy to synchronize OLAP and data warehouse data.
Data Changes
Changes to correct errors in a data warehouse can be minimized by applying care during the data transformation, validation, and scrubbing operations. Other changes to existing data warehouse data can arise from changes in the structure of an organization or its products. For example, reorganizing products into different categories can require significant changes to data in the data warehouse, as well as to reports derived from the data warehouse. In some cases, such changes can require the complete redesign of cubes. In other cases, the redesign of dimensions and the processing of all cubes that use those dimensions may be all that is required.
Changes to correct errors in basic data should be incorporated in the source database, usually the OLTP business database, and then migrated to the data warehouse in a controlled manner. Many business OLTP database designs require changes to be made by a transaction that offsets the incorrect data and applies new correct data. It is often easier to manage the impact of such correction transactions on OLAP data. Cubes can incorporate new data transactions that correct value errors, such as an incorrect sale value. However, transactions that move a fact from one dimension member to another, such as a sale posted to the wrong customer, can affect the results of aggregate functions such as Avg. This is true for non-OLAP databases as well; if an original sale order is zeroed out but the record remains in the database, it will be included in the count of sales records and affect the calculation.
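As a purely hypothetical illustration of this effect, the following rows show a sale that was posted to the wrong customer and then corrected by an offsetting entry; the table and column names are invented for the example.

-- Sale 1 was recorded against the wrong customer, then reversed and reposted.
CREATE TABLE SaleFacts (SaleID int, CustomerKey int, Amount money)
INSERT INTO SaleFacts VALUES (1, 101, 500.00)    -- original row, wrong customer
INSERT INTO SaleFacts VALUES (2, 101, -500.00)   -- offsetting correction
INSERT INTO SaleFacts VALUES (3, 202, 500.00)    -- reposted to the correct customer

-- SUM still reports 500.00, but the correction rows remain in the record count,
-- so COUNT returns 3 and AVG returns about 166.67 instead of 500.00.
SELECT COUNT(*) AS SaleRows, SUM(Amount) AS TotalAmount, AVG(Amount) AS AverageAmount
FROM SaleFacts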
Depending on cube storage design, changes to data in the fact table can affect the accuracy of queries to a cube until the cube is processed. The Refresh data processing option can be used to reload the cube's data and recalculate the aggregations. Because aggregation design remains the same, the Refresh data processing option is faster than the complete Full process processing option.
Dimension hierarchies can be affected by changes to data in the data warehouse dimension tables even though the table schema remains the same. The dimension hierarchy is based on relationships between members in a dimension table. When these relationships are changed (for example, when cities are reorganized into different sales regions), the dimension structure must be rebuilt.
Referential integrity must be maintained when data warehouse data is added, changed, or deleted. Loss of referential integrity can result in errors during cube processing, fact table records being bypassed, or inaccurate OLAP information.
Structure Changes
The structure of OLAP cubes and dimensions can be affected by changes to the design of the data warehouse such as the addition, deletion, or alteration of tables, or relationships between tables. When the structure changes, you must modify the design of affected cubes and dimensions, redefine partitions and aggregations, and completely process the modified cubes and dimensions.
Synchronizing OLAP and Data Warehouse Data
Valid cubes are online and available to client applications at all times when the Analysis server is running. Because OLAP cube partitions can interact with data in the data warehouse, the design of the data warehouse should include a synchronization strategy so that data can be added without causing cubes that are available to online client applications to provide incorrect answers to queries.
One strategy for managing additions to data warehouse and OLAP data is to design a batch update system. In this strategy, all data in the data warehouse fact table includes a batch number in each record. When you design a cube, add an expression to the filter for each of the cube's partitions to specify the largest batch number applicable, for example, "... AND DWBatch <= 33 ..." When additions to the fact table need to be made, include a new, higher batch number in the new records. Cubes are unaffected by these added records because the cube partitions are restricted to reading data from previous batches only.
Data added to a dimension table does not affect existing cube private or shared dimensions until the dimensions are processed. A batch number in dimension table records is not necessary, but it can be useful in ensuring continued referential integrity.
Dimensions and cubes or partitions can be processed to incorporate new data after a batch of data has been added to the fact table and dimension tables. Shared dimensions should be processed before the cubes that use them. To add new members to a dimension that do not affect the dimension's structure, use the Incremental update option. To add new members and rebuild the dimension's structure, use the Rebuild the dimension structure option. Note that when a shared dimension is processed with the Rebuild the dimension structure option, all cubes that incorporate that dimension will immediately become unavailable to client applications and must be processed before they can be used again. However, when a shared dimension is processed using the Incremental update option, a cube that uses the shared dimension will display the new members, but the cells associated with those members will remain empty until the cube is updated with new data from the fact table that relates to the new members.
To incorporate a new data batch in a cube, update the filter expression in each of the cube's partitions to include the new batch number, and then process or incrementally update the cube. If a cube's data is divided among multiple partitions, you can use one of the partitions to accumulate new data batches and process that partition only. The cube's other partitions must have filters that exclude new data so that data will be added only to the accumulation partition.
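A minimal sketch of this batch-number strategy follows. The table and column names (SalesFact, StagingSales, DWBatch) are assumptions made for the example, and the filter expression shown in the comments is the kind of clause you would place in each partition's filter definition, not a statement that runs on its own.

-- Illustrative fact table that carries a load batch number in every record.
CREATE TABLE SalesFact (
    TimeKey     int   NOT NULL,
    CustomerKey int   NOT NULL,
    ProductKey  int   NOT NULL,
    SalesAmount money NOT NULL,
    DWBatch     int   NOT NULL    -- data warehouse load batch number
)

-- Each cube partition's filter restricts it to completed batches, for example:
--     ... AND DWBatch <= 33 ...

-- Loading batch 34 does not disturb the cube, because the partitions still read only batches 1 through 33.
INSERT INTO SalesFact (TimeKey, CustomerKey, ProductKey, SalesAmount, DWBatch)
SELECT TimeKey, CustomerKey, ProductKey, SalesAmount, 34
FROM StagingSales    -- assumed staging table holding the newly loaded data

-- After the load is verified, raise the limit in each partition filter (for example, DWBatch <= 34)
-- and then process or incrementally update the cube to make the new batch visible.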
Visibility to Client Applications
When a cube that is currently online is processed by any one of the three processing options (Full process, Incremental update, or Refresh data), the cube remains online until the processing has been completed, at which time the online cube is replaced by the new cube version. When a cube is processed using the Full process option, online client applications will be disconnected from the cube when the switch is made to the new version of the cube, and the client applications must individually reconnect to access the new version. When a cube is processed using either the Incremental update or the Refresh data option, online client applications will not be disconnected from the cube when the processing completes. The new version of the cube will be immediately visible with no break in service.
The processing of a shared dimension can affect cubes that incorporate the dimension in their design. If a shared dimension is processed using the Rebuild the dimension structure option, all cubes that use the dimension will immediately become unavailable to client applications and must be processed before they can be used again. If a shared dimension is processed using the Incremental update option, cubes that use the dimension remain available to client applications and any new members added to the dimension automatically become available to client applications when the dimension processing is complete. Any such new members will not have fact data associated with them until the cube is updated with new related facts.
Analysis Services Architecture
Microsoft® SQL Server™ 2000 Analysis Services includes the Analysis server and PivotTable® Service. The Analysis server creates and manages multidimensional data cubes for online analytical processing (OLAP) and provides multidimensional data to PivotTable Service, which in turn provides this data to clients through Microsoft ActiveX® Data Objects (Multidimensional) (ADO MD) and OLE DB for OLAP provider services.
The server stores cube meta data (cube definition specifications) in a repository. Completed cubes can be stored in a variety of storage modes: as multidimensional database files (MOLAP), as tables in a relational database (ROLAP), or as a hybrid of multidimensional database files and relational tables (HOLAP).
Source data for multidimensional cubes resides in relational databases where the data has been transformed into a star or snowflake schema typically used in OLAP data warehouse systems. Analysis Services can work with many relational databases that support connections using ODBC or OLE DB. When used as part of SQL Server 2000, Analysis Services offers enhanced security and other capabilities. The Data Transformation Services (DTS) feature of SQL Server 2000 provides a means to manage the data warehouse from which Analysis Services creates cubes.
Control of the server is accomplished through the Analysis Manager user interface, or through custom applications developed using the Decision Support Objects (DSO) object model. DSO controls the creation and management of cubes by the server, and manages the cube meta data in the repository. The object model is used by the Analysis Manager program that provides the user interface through a snap-in to Microsoft Management Console (MMC). The DSO object model can be used by applications written in Microsoft Visual Basic® to provide custom programmatic control of the server. You can also develop custom applications to interact with the Analysis Manager user interface.
The Analysis server uses PivotTable Service to provide multidimensional data to client consumer applications, and the Analysis Manager user interface itself uses PivotTable Service to obtain multidimensional data from the server for browsing by the server administrator.
Programming Analysis Services Applications
Microsoft® SQL Server™ 2000 Analysis Services provides support for you to create and integrate custom applications that enhance your online analytical processing (OLAP) and data mining installation.
Analysis Services includes the Analysis server and PivotTable® Service. The Analysis server manages and stores multidimensional information and serves client application requests for OLAP data. PivotTable Service is an OLE DB for OLAP provider that connects client applications to the Analysis server and manages offline cubes. A repository of meta data contains definitions of OLAP data objects such as cubes and their elements.
An object model, Decision Support Objects (DSO), provides support for the Analysis Manager user interface and for custom applications that manage OLAP meta data and control the server. An interface, IOlapAddIn, enables your applications to extend and interact with the user interface. PivotTable Service provides access to OLAP data from the server and the ability to create local cubes.
You can create applications that:
Manage the Analysis server and create and maintain OLAP and data mining objects such as cubes, dimensions, security roles, and data mining models.
Extend the user interface by adding new objects to the object tree pane and by adding and responding to new menu choices.
Connect to the Analysis server, query data in cubes, and create local cubes.
Combine any or all of these functions.
This section contains the following topics.
Analysis Services Architecture: Information about the architecture of Analysis Services and its components.
Analysis Services Component Tools: Information about the tools in Analysis Services that you can use to create administrative support applications and client data access applications.
SQL in Analysis Services: Details of the implementation of SQL in Analysis Services.
Decision Support Objects: Information about the Analysis Services server object model, the component tool for managing OLAP and data mining objects.
Add-ins: Information about the IOlapAddIn interface you can use in your applications to interact with the Analysis Manager user interface.
PivotTable Service: Information about the client application service you can use with applications that query OLAP and data mining data and create local cubes and mining models.
Analysis Services Programming Samples: Information about the samples that illustrate development of applications for Analysis Services.
OLAP and Data Warehouses
OLAP provides a multidimensional presentation of data warehouse data, creating cubes that organize and summarize data for efficient analytical querying. The design of the data warehouse structure can affect how easily these cubes can be designed and constructed.
Microsoft® SQL Server™ 2000 Analysis Services relies on the data provided by the data warehouse to be accurate, stable, and to have referential integrity. When creating a data warehouse for use with Analysis Services, these design factors should be considered:
Use a star schema if possible.
If a snowflake schema is needed, minimize the number of dimension tables beyond the first level from the fact table.
Design dimension tables for the users.
Dimension tables should include meaningful information about the facts that users will want to explore, such as the color or size of a product.
Apply commonsense normalization to dimension table design.
Unrelated data should not be combined into a single dimension table, and data should not be repeated in multiple dimension tables. For example, create a separate customer dimension instead of repeating customer information in more than one dimension table.
Do not over-summarize in the fact table.
Retain the finest level of granularity users need to access, and keep all fact table records at the same level of detail. Analysis Services is designed to create and manage summary data from highly granular data warehouses without penalizing users in query response time.
Use a common fact table structure for similar data.
Data intended to be used in the same cube can be stored in multiple fact tables, but those tables must have the same structure.
Do not create auxiliary tables of summarized data.
Analysis Services precalculates summaries into structures that are designed for query efficiency. Other auxiliary summarization tables are not used.
Create indexes on key fields.
For each dimension table, create an index on its key column. For each fact table, create a single index on the combination of columns that contain the foreign keys of the dimension tables associated with the fact table. Analysis Services uses these indexes when it loads multidimensional data structures and calculates summary data. These indexes significantly improve cube processing performance. A sketch of these indexes on a simple star schema appears after this list.
Ensure referential integrity.
It is important that all facts be represented in all dimension tables. Facts in a fact table that do not have a corresponding key in a dimension table can cause errors or fact table rows to be ignored if the fact and dimension tables are used in the same cube.
Design a data update strategy.
When data is added to or changed in the data warehouse, cubes that have been built from previous data must be updated before the new data is available to users. Incorporating additional data into cubes requires less time than rebuilding cubes when existing data changes.
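The sketch below, referenced from the indexing guideline above, shows what these recommendations might look like for a small, purely illustrative star schema; every table and column name is an assumption made for the example.

-- Illustrative dimension and fact tables for a simple star schema.
CREATE TABLE TimeDim     (TimeKey     int NOT NULL, CalendarDate datetime     NOT NULL)
CREATE TABLE CustomerDim (CustomerKey int NOT NULL, CustomerName nvarchar(60) NOT NULL, City nvarchar(40) NOT NULL)
CREATE TABLE ProductDim  (ProductKey  int NOT NULL, ProductName  nvarchar(50) NOT NULL, Category nvarchar(30) NOT NULL)
CREATE TABLE SalesFact   (TimeKey     int NOT NULL, CustomerKey  int NOT NULL, ProductKey int NOT NULL, SalesAmount money NOT NULL)

-- One index on the key column of each dimension table.
CREATE UNIQUE INDEX IX_TimeDim_TimeKey         ON TimeDim (TimeKey)
CREATE UNIQUE INDEX IX_CustomerDim_CustomerKey ON CustomerDim (CustomerKey)
CREATE UNIQUE INDEX IX_ProductDim_ProductKey   ON ProductDim (ProductKey)

-- A single composite index on the fact table covering the foreign keys of all associated dimension tables.
CREATE INDEX IX_SalesFact_DimensionKeys ON SalesFact (TimeKey, CustomerKey, ProductKey)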
About Data Warehouses
A data warehouse is often used as the basis for a decision support system. Data warehouses are designed to overcome problems encountered when an organization attempts to perform strategic analysis using the same database that is used for online transaction processing (OLTP).
OLTP systems typically:
Support large numbers of concurrent users who are actively adding and modifying data.
Represent the constantly changing state of an organization but don't save its history.
Contain large amounts of data, including extensive data used to verify transactions.
Have complex structures.
Are tuned to be responsive to transaction activity.
Provide the technology infrastructure to support the day-to-day operations of an organization.
Difficulties often encountered when OLTP databases are used for online analysis include the following:
Analysts often do not have the technical expertise required to create ad hoc queries against the complex data structure.
Analytical queries that summarize large volumes of data adversely affect the ability of the system to respond to online transactions.
System performance when responding to complex analysis queries can be slow or unpredictable, providing inadequate support to online analytical users.
Constantly changing data interferes with the consistency of analytical information.
Security becomes more complicated when online analysis is combined with online transaction processing.
Data warehousing provides one of the keys to solving these problems, by organizing data for the purpose of analysis. Data warehouses:
Can combine data from heterogeneous data sources into a single homogenous structure.
Organize data in simplified structures for efficiency of analytical queries rather than for transaction processing.
Contain transformed data that is valid, consistent, consolidated, and formatted for analysis.
Provide stable data that represents business history.
Are updated periodically with additional data rather than frequent transactions.
Simplify security requirements.
Provide a database organized for OLAP rather than OLTP.
A data mart is a special form of data warehouse, typically containing a topic-oriented subset of enterprise data appropriate to a specific business function.
Microsoft® SQL Server™ 2000 provides many essential tools for building data warehouses and data marts, including Data Transformation Services (DTS).
Data Warehousing and OLAP
Although sometimes used interchangeably, the terms data warehousing and online analytical processing (OLAP) apply to different components of systems often referred to as decision support systems or business intelligence systems. Components of these types of systems include databases and applications that provide the tools analysts need to support organizational decision-making.
A data warehouse is a database containing data that usually represents the business history of an organization. This historical data is used for analysis that supports business decisions at many levels, from strategic planning to performance evaluation of a discrete organizational unit. Data in a data warehouse is organized to support analysis rather than to process real-time transactions as in online transaction processing (OLTP) systems.
OLAP technology enables data warehouses to be used effectively for online analysis, providing rapid responses to iterative complex analytical queries. OLAP's multidimensional data model and data aggregation techniques organize and summarize large amounts of data so it can be evaluated quickly using online analysis and graphical tools. The answer to a query into historical data often leads to subsequent queries as the analyst searches for answers or explores possibilities. OLAP systems provide the speed and flexibility to support the analyst in real time.