Data Base
Table of Contents

What is DDL, DML, DCL, TCL in SQL Server?

The Data Definition Language (DDL) includes,

CREATE TABLE - creates new database table
ALTER TABLE - alters or changes the database table
DROP TABLE - deletes the database table
CREATE INDEX - creates an index or used as a search key
DROP INDEX - deletes an index

The Data Manipulation Language (DML) includes,

SELECT - extracts data from the database
UPDATE - updates data in the database
DELETE - deletes data from the database
INSERT INTO - inserts new data into the database

The Data Control Language (DCL) includes,

GRANT – gives access privileges to users for database
REVOKE – withdraws access privileges to users for database

The Transaction Control (TCL) includes,

COMMIT – saves the work done
ROLLBACK - restore the database to original since the last COMMIT

Define candidate key, alternate key, composite key?

primary key:- The attribute or combination of attributes
that uniquely identifies a row or record.

Foreign Key:- an attribute or combination of attribute in a
table whose value match a primary key in another table.

Composite key:- A primary key that consistsof two or more
attributes is known as composite key

candidate key:- is a column in a table which has the
ability to become a primary key.

Alternate Key:- Any of the candidate keys that is not part
of the primary key is called an alternate key.

having vs where

Answer in one line is : HAVING specifies a search condition for a group or an aggregate function used in SELECT statement.

SELECT titles.pub_id, AVG(titles.price)
FROM titles INNER JOIN publishers
ON titles.pub_id = publishers.pub_id
WHERE publishers.state = ‘CA’
GROUP BY titles.pub_id
HAVING AVG(titles.price) > 10

What structure can you have the database make to speed up table reads?

Follow the rules of DB tuning we have to:

  • Properly use indexes ( different types of indexes)
  • properly locate different DB objects across different tablespaces, files and so on
  • create a special space (tablespace) to locate some of the data with special datatype ( for example CLOB, LOB and …)

What are the tradeoffs with having indexes?

  • Faster selects, slower updates.
  • Extra storage space to store indexes. Updates are slower because in

addition to updating the table you have to update the index.

What is "normalization"? "Denormalization"? Why do you sometimes want

to denormalize?
Normalizing data means eliminating redundant information from a table and organizing the data so that future changes to the table are easier.

Denormalization means allowing redundancy in a table. The main benefit of denormalization is
improved performance with simplified data retrieval and manipulation. This is done by reduction in the number of joins needed for data processing.

What is "index covering" of a query?

Index covering means that "Data can be found only using indexes, without touching the tables"

What is the difference between an outer join and an inner join? left join , right join

You use INNER JOIN to return all rows from both tables where there is a match. ie. in the resulting table all the rows and colums will have values.

In OUTER JOIN the relulting table may have empty colums. Outer join may be either LEFT or RIGHT

LEFT OUTER JOIN returns all the rows from the first table, even if there are no matches in the second table.

RIGHT OUTER JOIN returns all the rows from the second table, even if there are no matches in the first table.

How to check if there is at least one record in the table that satisfies a given condition? what is a large table

What is 2-phased commit?

What is a hash-join? Nested loop join? Merge join?

Sort merge join is used to join two independent data sources. They perform better than nested loop when the volume of data is big in tables but not as good as hash joins in general.

They perform better than hash join when the join condition columns are already sorted or there is no sorting required.

What are the properties of a DB transaction?


* Atomicity is an all-or-none proposition.
* Consistency - the database should be in consistent state between multiple states in transaction.
* Isolation - no other queries can access the data modified by a running transaction.
* Durability guarantees that the database will keep track of pending changes in such a way that the server can recover from an abnormal termination.

* Atomicity. In a transaction involving two or more discrete pieces of information, either all of the pieces are committed or none are.
* Consistency. A transaction either creates a new and valid state of data, or, if any failure occurs, returns all data to its state before the transaction was started.
* Isolation. A transaction in process and not yet committed must remain isolated from any other transaction.
* Durability. Committed data is saved by the system such that, even in the event of a failure and system restart, the data is available in its correct state.

What’s the difference between CHAR_LENGTH and LENGTH?

The first is, naturally, the character count. The second is byte count. For the Latin characters the numbers are the same, but they’re not the same for Unicode and other encodings.

What’s the default port for MySQL Server?


What happens when the column is set to AUTO INCREMENT and you reach

the maximum value for that table?
It stops incrementing. It does not overflow to 0 to prevent data losses, but
further inserts are going to produce an error, since the key has been used already

If you specify the data type as DECIMAL (5,2), what’s the range of values

that can go in this table?
999.99 to -99.99. Note that with the negative number the minus sign is considered one of the digits.

What is table partition?

differences between droppin, truncating , delete

Dropping : (Table structure + Data are deleted), Invalidates the dependent objects ,Drops the indexes
Truncating: (Data alone deleted), Performs an automatic commit, Faster than delete

Delete : (Data alone deleted), Doesn’t perform automatic commit

what is union or join

union statement will combine the result-sets of multiple queries. each query in a union statement must have the same number of columns.

A UNION ALL is taking two of the exactly same structured tables and making them into one table. For a lack of a better example you had two tables: tblCustomersCompanyA and tblCustomersCompanyB that had the same columns, you could run a UNION ALL query to make them appear as one table.

what's the difference between order by and group by

Can use Agregate functions with GROUP BY. such as (sum, avg etc) with group by

how to speed up a SQL query

what's normalization

Normalisation and denormalisation are exact opposite to each other.

Normalization is done to remove redundancy(ie to remove duplicates)

The steps involved are:
1 NF (to bring data to atomic level)
2 NF (To define primary-foreign relationship)
3 NF (To remove transive dependency)

how to transform data between different sources in SQL Server and Oracle

Difference between nested query,sub query and nested query?

Correlated subquery runs once for each row selected by the outer query. It contains a reference to a value from the row selected by the outer query.

For example,

select e1.empname, e1.basicsal, e1.deptno from emp e1 where e1.basicsal = (select max(basicsal) from emp e2 where e2.deptno = e1.deptno)

Nested subquery runs only once for the entire nesting (outer) query. It does not contain any reference to the outer query row.
For example,

select empname, basicsal, deptno from emp where (deptno, basicsal) in (select deptno, max(basicsal) from emp group by deptno)

When you use many to many relationship

What Are The Six Normalization Forms In SQL(Standard Query Language)?

Normal forms in SQL are given below.

First Normal form (1NF): A relation is said to be in 1NF if it has only single valued attributes, neither repeating nor arrays are permitted.

Second Normal Form (2NF): A relation is said to be in 2NF if it is in 1NF and every non key attribute is fully functional dependent on the primary key.

Third Normal Form (3NF): We say that a relation is in 3NF if it is in 2NF and has no transitive dependencies.

Boyce-Codd Normal Form (BCNF): A relation is said to be in BCNF if and only if every determinant in the relation is a candidate key.

Fourth Normal Form (4NF): A relation is said to be in 4NF if it is in BCNF and contains no multi valued attributes.

Fifth Normal Form (5NF): A relation is said to be in 5NF if and only if every join dependency in relation is implied by the candidate keys of relation.

Domain-Key Normal Form (DKNF): We say that a relation is in DKNF if it is free of all modification anomalies. Insertion, Deletion, and update anomalies come under modification anomalies.

What is a "constraint"?


  A constraint allows you to apply simple referential integrity checks to a table. There are 5 primary types of constraints that are currently supported by SQL Server:

PRIMARY/UNIQUE - enforces uniqueness of a particular table column.
DEFAULT - specifies a default value for a column in case an insert operation does not provide one.
FOREIGN KEY - validates that every value in a column exists in a column of another table.
CHECK - checks that every value stored in a column is in some specified list
NOT NULL - is a constraint which does not allow values in the specific column to be null. And also it is the only constraint which is not a table level constraint.

What is the query used to display all tables names in SQL Server (Query analyzer)?

select * from information_schema.tables

What are the large object types supported by Oracle?

Blob and Clob

What's the difference between a primary key and a unique key?

Both primary key and unique enforce uniqueness of the column on which they are defined. But by default primary key creates a clustered index on the column, where are unique creates a nonclustered index by default.

Another major difference is that,primary key doesn't allow NULLs, but unique key allows one NULL only.


Synonyms is the alias name for table, views, sequences & procedures and are created for reasons of Security and Convenience. Two levels are Public - created by DBA & accessible to all the users. Private - Accessible to creator only. Advantages are referencing without specifying the owner and Flexibility to customize a more
meaningful naming convention.


sample query


  • simple subquery
SELECT empnum, name
FROM employee as e1
WHERE salary > 
(SELECT avg(salary)
   FROM employee
   WHERE department = 'abc'

In the above example, the inner query is first executed, and the result is then fed into the outer query.

  • correlated subquery

In this example we are finding the list of employees (employee number and names) having more salary than the average salary of all employees in that employee's department.

SELECT empnum, name
FROM employee as e1
WHERE salary > 
(SELECT avg(salary)
   FROM employee
   WHERE department = e1.department);

the inner query has to be executed for every employee as the department will change for every row. Hence the average salary will also change.

  • for correlated subquery: join is faster

you will typically see that the left join will perform better, in this type of case, because of the cardinality of the employee table. You are selecting all rows from Employee, which is 1,000,000. The correlated subquery has to be evaluated for every row returned by the outer query, which is select (…) from employee, with no where clause. This means that the department subquery has to be evaluated 1,000,000 times. On the other hand, the left join should perform better because it does not have to do a correlated evaluation, on a row-by-row basis.

  • for simple subquery, subquery is faster than join

The difference between nested-loops for joins and nested-loops for subqueries is that a subquery needs to find only a single match, so it might be able to break out early from the inner loop. In contrast, a join needs all matches, so it must chug through all rows.

One major point in favor of writing joins instead of subqueries is that it's better not to tell the DBMS what to do. By writing joins, you're giving the DBMS choices instead of forcing it to follow a specific plan.

flatten query

Subquery with IN:

  WHERE Table1.column1 IN
     (SELECT Table2.column1
        FROM Table2
        WHERE Table2.column1 = 5)


SELECT Table1.*
  FROM Table1, Table2
  WHERE Table1.column1 = Table2.column1
    AND Table2.column1 = 5

This simple flattening seems like it should be equivalent to the IN query, but it has a flaw that can be seen if
- Table1 has these three rows: {1, 5, 5} ;
- Table2 has these three rows: {2, 5, 5}.

Notice that the subquery-based SELECT would correctly return two rows: {5, 5}, but the flattened SELECT would return four rows: {5, 5, 5, 5}.
Notice further that if the flattened SELECT is "corrected" by using SELECT DISTINCT instead of just SELECT, the join would return only one row: {5}.

In other words, flattening can cause too many duplicate rows—and if you try to get rid of the duplicates, you can end up with too few duplicate rows.


cheat sheet


Unless otherwise stated, the content of this page is licensed under Creative Commons Attribution-ShareAlike 3.0 License