SQL for Data Storage

Data Store is a repository for storing and managing collections of data. It includes database, files, emails, etc. Database is a series of bytes and it is managed by Database Management System (DBMS). A file is a series of bytes that is managed by a file system. Therefore, data store is a series of bytes that is once stored.

Microsoft SQL Server is a Relational Database Management System (RDBMS) that stores data in tables. Table is a database object and it contains data. Data is stored in tables in rows and columns. These database tables are stored in a schema. We can create up to 2,147,483,647 tables in a database and each table can have up to 1024 columns. These SQL server tables are stored in database in the form of pages of 8 KB each.

SQL Server Database are stored in disk and has at least 2 operating system files:

1>     Data File – These files are divided into 8 KB pages. Data Files are of two types:

  • Primary Data Files: Every Database has 1 primary data file. Database startup information is stored in this file. This file also contains user data and objects. These files have an extension of .mdf.
  • Secondary Data Files: It is used to spread data across multiple files/disks. Each file resides on a different disk drive. These files have an extension of .ndf.

2>     Log File – These files have an extension of .ldf.

SQL Server database can have multiple data and log files but it can have only one primary data file. SQL Server tables has pages as the fundamental unit of data storage. Storage space that is allocated to database is divided into sequentially numbered pages from 0 to n. Each page is 8 KB in size. A row in a database table is limited to 8 KB size and cannot exceed more than one page. If the data exceeds 8 KB and row contains varchar or varbinary data, then the data in those columns are moved to a new page. Disk I/O operations are performed at page level. SQL Server reads and writes whole data pages.

The data pages for the SQL Server table can be organized within each partition in two ways: heap or B-Tree Clustered tables.

1>     Heap table: In heap table, the rows are not sorted in any particular order within each data page. The available space is checked and data rows are inserted to the first available location. If there are no space available, then additional pages will be added to the table and then rows will be inserted into these new pages. We can sort the data that is fetched from heap table in ascending or descending order using ORDER by clause.

2>     Clustered Table: This table has predefined clustered index on one or more columns of the table. Based on this clustered indexed column, rows are sorted within the data pages. Clustered index is automatically created on the Primary Key column. As the table rows can be sorted in only one order so, there can be only one clustered index on each table. The clustered index is built using the B-tree structure.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s