INTERVIEW Q&A
Most of the questions here are, which I was asked in my interviews!
What is the Difference between Clustered Index and Non-clustered Index in SQL Server?
Even this is basics in SQL server; I hear this question from most of the interviewers
Clustered Index: When you create a clustered Index, the actual leaf level pages of the Index contain the data of the base table and sorted by the column on which the index is created know as the clustered key. There can be only one clustered index per table as the data is sorted based on that column. Although multiple columns can be added to the Index also called as composite Index
Non-Clustered Index: Whereas when you create a non-clustered index on a column, the leaf level pages does not contain the actual data of the base table but a pointer directed to the data records in the clustered key if you have clustered index created already or to the record of the base table in case of heap (No clustered Index) you can also add or include additional columns into non-clustered indexes as same in clustered Indexes. We can have up to 249 non-clustered indexes in SQL server 2005 and 999 in later versions per table.
What is the difference between ‘TRUNCATE’ and ‘DELETE’?
Truncate and delete both are used to delete the rows from a SQL Server Table. There is myth that so many DBA’s believe that Truncate is non logged operation but whereas DELETE is a logged operation but this is fully not true.
Truncate is also a logged operation but logs only the de-allocations of the deleted rows from the table but Delete logs all the deleted rows in to the log file.
Another major difference is you cannot use ‘where’ clause with TRUNCATE statement but ‘where’ clause can be used with DELETE
EX: Trunacte table [Dbo].[Employee]
EX: DELETE from [Dbo].[Employee] where Name like ‘Microsoft’
Which is faster to delete the entire table data? Obviously TRUNCATE whereas Delete has log each and every row.
Note: Both Truncate and Delete deletes only data not the structure of the table.
What is normalization and Normalization forms?
The process of organizing data to minimize redundancy is called normalization. Normalization usually involves dividing large tables into smaller (and less redundant) tables and defining relationships between them. The objective is to isolate data so that additions, deletions, and modifications of a field can be made in just one table and then propagated through the rest of the database via the defined relationships.
There are 7 Normal forms defined in the real world but not really 6 Normal forms are implemented but only 3 normal forms are generally implemented in most cases. Any way we should beware of all of the normal forms in point of interview even though we don’t use them all because the client may use them..ha.
First Normal Form (1 NF)
Second Normal Form (2 NF)
Third Normal Form (3 NF)
Boyce-Codd Normal Form (BCNF)
Fourth Normal Form (4 NF)
Fifth Normal Form( 5 NF)
Sixth Normal Form(6 NF)
The best place I have seen them explain is here. Knowing at least 3 NF in depth is needed for devlopers and DBA’s but not to Architects.
What is de-normalization?
De-normalization is the process of attempting to optimize the read performance of a database by adding redundant data or by grouping data. De-normalization should only take place after a satisfactory level of normalization has taken place. De-normalization techniques are often used to improve the scalability of Web applications.
De-normalization is often used to improve the performance of the database if there is any impact of Normalization on the database.
For Example:
You have two tables in referential integrity and while loading the data to tables might be any issue (depends on scenario) in that times we disable the referential integrity which is type of de- normalizing
What is difference between the stored procedure and function?
They are completely different in both the structure they will be created and the purpose too but even though I often hear this question from the interviewers. The only similarity is they can be re-usable multiple times in your development now and in future and also both takes parameters .
The function should return some value whereas stored procedure may not return a value.
Stored procedures will be executed independently by using EXEC store_proc whereas function should be called with in a SQL Query suppose say select serverproperty(‘productlevel’)
What’s the difference between a primary key and a unique key?
Even though both the primary key and unique key are one or more columns that can uniquely identify a row in a table, they have some important differences. Most importantly, a table can have only a single primary key while it can have more than one unique key. Primary key can be considered as a special case of the unique key. Another difference is that primary keys have an implicit NOT NULL constraint while the unique key does not have that constraint. Therefore, unique key columns may or may not contain NULL values but primary key columns cannot contain NULL values.
List some of the new features in SQL server 2008? (Version Related)
Backup Compression
Page compression
Row compression
Hot add CPU
Resource Governor
Policy Based Management
Data Collector
Data-tier applications
Change data Capture (CDC)
What is new release of SQL Server yet to come?
Code Named “Denali”, referring as SQL server 2011 as of today DENALI (CTP3) is released.
What is RAID and what are different types of RAID configurations?(Storage Related)
RAID stands for Redundant Array of Independent Disks. RAID 0, RAID 1, RAID 2, RAID 5 and RAID 10 are most popular RAID levels used for relational databases storage but there are other RAID levels too. To better understand the RAID level architecture and RAID level functionality follow this link
Why we should have High Availability?
The main point of a high-availability strategy is to keep the critical data as available as possible in the event of a failure or disaster. Failures, by their very nature, are not planned events and so it is hard to predict when one will occur and what form it will take. There are also times when planned events may occur that affect data availability if preventative steps have not been taken.
Difference between High Availability and Disaster Recovery?
Disaster Recovery is the process to restore operations critical to the resumption of business after a natural or human-induced disaster. High availability is to continue operations when a component fails. This is usually a hardware component like a CPU, power supply, Disk failure, Memory failure or the complete server. With DR there is usually some short loss of service while the DR systems and databases are activated and the operation is switched over to the DR site. With HA there is usually no loss of service when a component fails as it is immediately fail over to another serve without any downtime for the users in some cases.
Why does the log file grow? Explain the scenarios?
There can be many reasons for the log file to grow; some of them can be below,
Choosing improper recovery model for the database
Running Huge Bulk Loads and not having an appropriate recovery model
Huge inserts on highly indexed tables
Having database in FULL recovery model and failed to do log backup
Lack of good backup strategy.
Here is link for an excellent blog post
What steps you follow in tuning a long running stored procedure?
Checking the resources (CPU, Memory, IO) utilized by the stored procedure execution.
See if you can better re-code it to get the same results( Ex: Union all instead union)
Checking the execution plan to find the high cost operation in the process of execution
Finding any missing indexes on tables
Checking if there is a need for any Query Hints likes Maxdop (1), with recompile.
Have you ever worked on Clustering? If yes then how does the failover works in clustering?
Yes, the clustered nodes use the heartbeat to check whether each node is alive, at both the operating system and SQL Server level.
At the operating system level, the primary node reserves the resource every 3 seconds, and the competing node every 5 seconds. The process lasts for 25 seconds and then starts over again. For example, if the node owning the instance fails due to a problem (network, disk, and so on), at second 19, the competing node detects it at the 20-second mark, and if it is determined that the primary node no longer has control, the competing node takes over the resource.
At SQL Server Level, the node hosting the SQL Server resource does a looks-alive check every 5 seconds. This is a lightweight check to see whether the service is running and may succeed even if the instance of SQL Server is not operational. The IsAlive check is more thorough and involves running a SELECT @@SERVERNAME Transact SQL query against the server to determine whether the server itself is available to respond to requests; it does not guarantee that the user databases are up. If this query fails, the IsAlive check retried five times and then attempts to reconnect to the instance of SQL Server. If all five retries fail, the SQL Server resource fails. Depending on the failover threshold configuration of the SQL Server resource, Windows Clustering will attempt to either restart the resource on the same node or fail over to another available node.
During the fail over from one node to another, Windows clustering starts the SQL Server service for that instance on the new node, and goes through the recovery process to start the databases. The fail over of the SQL Server virtual server will take a short time (probably seconds). After the service is started and the master database is online, the SQL Server resource is considered to be up. Now the user databases will go through the normal recovery process, which means that any completed transactions in the transaction log are rolled forward, and any incomplete transactions are rolled back. The length of the recovery process depends on how much activity must be rolled forward or rolled back upon startup. Set the recovery interval of the server to a low number to avoid long recovery times and to speed up the failover process.
What is /3GB switch, PAE and AWE related to memory management and what they do?
Here is the best article I found and read /3gb , PAE and AWE
How to recover a database from ‘Suspect Mode’ in SQL Server?
Step 1: ALTER DATABASE My dB SET EMERGENCY
Step 2: ALTER DATABASE My dB SET SINGLE_USER WITH ROLLBACK IMMEDIATE
Step 3: DBCC CHECKDB (‘MY DB’, REPAIR_ALLOW_DATA_LOSS)
Step 4: ALTER DATABASE MY DB SET MULTI_USER
How to copy backup files from one server to another server?
This can be done in multiple ways,
Method 1: xp_cmdshell ‘Robocopy <source folder> < destination folder> <file to copy>’
Method 2: Creating an SSIS Package using File Transfer Package.
Method 3: Copying using shared folder or Network Copy
Method 4: Other third Party Tools, X-copy utility and other.
How to start and stop SQL Server from CMD?
I’m not sure why the interviewer asks this question but they do ask this :D
Step 1: To Start, Open CMD and Type ‘NET START MSSQLSERVER’
Step 2: To Stop, ‘NET STOP MSSQLSERVER’
How to connect to SQL Server instance Using CMD?
Step 1: Open CMD, Type: sqlcmd –S ‘instance name’ , Check for more information: BOL
How to Restore Master Database?
Step 1: Re-Start the Server in Single-User mode using –m startup Parameter, Check BOL
Step 2: Run à ‘RESTORE DATABASE master FROMWITH REPLACE’, this will end up in shutting down the server. Recommended to do this step using SQLCMD
Step 3: Now, Restart the server normal by taking off the –m parameter.
How to failover database in log shipping?
Step 1: Remove secondary database from Log shipping
Step 2: Copy over any recent transactional log backups made on primary that are not copied to the secondary shared folder yet.
Step 3: Restore all transaction log backups to secondary database; If possible when primary database is not damaged and you should take Tail-log backup with leaving the database in restoring mode this would revoke access to users.
Step 4: Restore Tail log on to secondary database and the databases are in Sync Now.
Step 5: Now, Redirect the users to the secondary database.
Accidentally deleted a table, how to recover it with data?
There are 2 ways to do this, depends on our situation. In both ways you need a Full Backup and consecutive differential and log backups till time.
1. Restore the Full Backup on your current database with norecovery and then Differential with norecovery and series of log backups with recovery only for the last log file to bring it online.
2. Restore the Full Backup to New Database with different name and then do the same process as in step 1 and copy over the table to the actual dB that you deleted , this 2nd way will not hurt or disturb your current users unless they were using this table till it get copied over.
No Backups ! You are screwed. you will know the importance of the backup only when it needed.