Name Resolution

advertisement
Name resolution, default schema, implicit schema
This document describes name resolution behavior of schema objects, types and XML
schema collections, and how it interacts with default schema, compile plan implicit
schema.
I. Context/Background
The following shows the general syntax for referring to an object/type/XML-Schema-collection
in SQL Server:
[server.] [database.] [schema.] entity
where:

entity is the name of the object/type/XML Schema collection.

schema is the schema containing the entity. The schema qualifier lets you refer to an
entity in a schema other than default or implicit schema. Only schema entities can be
qualified with schema. Schema entities are schema scoped entities including types, XML
schema collections, and objects such as tables, views, procedures, functions, DML
triggers, constraints, rules, defaults, etc. Non-schema objects, such as DDL triggers, event
notifications, cannot be qualified with schema because they are not schema scoped.

database is the database context of the entity. This identifier is only valid for schema
scoped objects but not types or XML collections.

server applies only when you are referencing to an object on linked/remote server. This is
the name of the linked server containing the object. The server qualifier lets you refer to
an object in a server other than your local server instance. Not all SQL statements allow
you to access objects on linked/remote servers. If server part is specified, the object is
resolved by delegating to the specified linked/remote server.
You can include spaces around the periods separating the parts of the reference to the entity, or
omit database or schema part between the periods. If names of any part include special characters,
you can use quoted identifier (BOL Identifier section).
In SQL 2000, types are owned by users and they are not schema scoped, i.e. their names are
unique in the containing database. The only way to reference a given type is to use single part
(entity) name.
1.2. System Objects
In SQL2005, System Objects are shipped with SQL Server in resource database. (See resource
database CSS Material) These include views, scalar- and table-functions, stored-procedures, and
extended-stored-procedures. Except Information-Schema (ANSI) views that belong to system
schema “INFORMATION_SCHEMA”, all system objects belong to a new, reserved system
schema “sys”. (Note: If a user database has a schema called “sys” it cannot be upgraded.)
These system objects can be visible (or resolvable) in every database, and compiled and/or
executed in a database as if it were created in that database. The following are feature-definitions
of the different types of System Objects.

System-Stored-Procedure – a System Object that has a name beginning with “sp_” and
is either a stored-procedure or an extended-stored-procedure.

System-Extended-Stored-Procedure – a System Object that has a name beginning with
“xp_” and is either an extended-stored-procedure or a stored-procedure. Unless schema
‘sys’ is qualified with or implied, it is visible only in “master” database (and thus always
executes only in master).

System View – a System Object that is a view. It can be a backward compatible view, a
catalog view, a dynamic management view (DMV) or an Information-Schema view.
Unless schema context is implied or it is a backward compatible view, explicit schema
‘sys’ or ‘INFORMATION_SCHEMA’ must be qualified with.

System Function – a System Object that is a function (scalar or table) that “exists” (or is
visible) in every database. It is recommended that the name begins with “fn_”.
Most of these system objects are migrated from SQL2000, and they were contained in schema
“dbo” or “system_function_schema” (unbind able and hidden, only contains system functions) of
master database. Back then, SQL Server name resolution and execution context have following
special behaviors to handle system objects:
1. Resolve “anydb.dbo.sp_” to “master.dbo.sp_”, but execute in “anydb” context.
When you run either of the following two batches:
USE my_database; EXECdbo.sp_my_procedure; go
EXEC my_database.dbo.sp_my_procedure; go
SQL Server 2000 searches master.dbo before my_database.dbo, therefore finding the system
procedure sp_my_procedure if it exists. This means that a fully-qualified 3-part (database,
schema and object) name may not literally resolve to what it says it should, rather a system object
from master database could be resolved instead and evaluated in the context of the specified
database. In other words, user procedures (if resolved) are always compiled and executed in the
database where they exist, but system procedures are compiled and executed in every database.
Clearly, we are hijacking user’s namespace in this scenario, i.e. if a user created procedure, which
is contained in schema “dbo”, has the same name as system procedure, the user will not be able to
execute this procedure.
Important Note: In above example, SQL 2000 considers dbo.sp_my_procedure in master as
system procedure only if it is marked as MS-shipped. If it is not marked as MS-shipped, then the
user procedure in my_database will be resolved before it. However, if the user procedure in
master was resolved indeed, then it would be compiled and executed in master database context,
except SQL 65 system tables contained in it, which are compiled in my_database context. This
behavior was for backward compatible. It is still preserved in SQL2005 for TSQL procedures
(not CLR procedures) but will be removed in some future releases.
2. System table-valued functions must be invoked with special “::” prefix, system scalar-valued
functions must be invoked without any schema.
In SQL 2000, system functions were contained in hidden schema “system_function_schema”,
which cannot be referenced. User defined functions must be qualified with schema. As a
consequence, this rule makes system functions unambiguously invoked. In SQL2005, the “::”
syntax is being deprecated, rather it is recommended to qualify with schema “sys”. Compatibility
will be maintained only for the System Functions shipped in SQL2000 by keeping an internal
table to map the old syntax to the replacement system function.
3. Information-Schema views must be referenced using schema “INFORMATION_SCHEMA’,
and they exist in master database but are compiled in every database.
4. All other MS-shipped “system” objects in master database, including extended procedures,
user defined functions, system tables, have to be invoked by specifying master database, and
thus always invoked in master database context.
1.3 Important Terminologies
[User’s Default Schema] In SQL 2000, there is an implicit relationship between users and
schemas. Users (actually all database principals) by default are owners of a schema that is of the
same name as the user (database principal). This implicit relationship is explicitly persisted upon
upgrade from SQL 2000. SQL2005 enables an association of a user with a default schema that is
not of the same name as the user, or multiple users to the same default schema. If a user’s default
schema is not specified while creating the user, dbo is the default schema.
[Execute-As Context] A procedure/function can be executed with the privileges of its owner or
with the privileges of the current user (caller), or with the privileges of arbitrary user, depending
on the procedure/function definition.
Standalone Execute-As/SETUSER can be executed in current session or inside
procedure/function.
[The Current Login/User] The session login/user is either the logged-in login/user or the
login/user associated with the remote linked/remote-server session. Without invoke any ExecuteAs context, either thru procedure/function or standalone, the current login/user is the session
login/user.
On database context switch, the user registered in the new database that is mapped to current
login is the new current user. If no user is registered for the login, guest user is used.
On entering an Execute-As context (except execute as caller), the specified login/user becomes
the current login/user. On exiting an Execute-As context, the current login/user reverts to the
previous current login/user, that is, the current login/user that triggered the Execute-As context
switch.
[Active Default Schema] Active default schema is schema of the SQL module if entity referenced
by a statement (except DDL, dynamic SQL, a.k.a. EXECUTE statements, or intrinsic such as
object_id()) inside a procedure, function or view. Otherwise, active default schema is current
user’s default schema.
II. Name Resolution Algorithm for Types and XML Schema Collections
2. 1. Algorithm
As implied by general syntax for referencing entities, types or XML schema collections cannot be
referenced across database. If schema name is unspecified, SQL server uses “sys first” algorithm
as outlined below.
1. First, Look in the “sys” schema if not DDL access, then
2. Look in the active default schema, then
3. Look in “dbo” schema.
If schema name is specified, SQL server simply looks in the designated schema.
Note:


Database collation is used to match schema, type and XML schema collection names.
DDL access includes CREATE/DROP/ALTER on types or XML schema collections.
2.2. Backward Compatibility
During upgrade, system types are put in “sys” schema, user defined types are put in “dbo”
schema. In SQL 2000, user defined types can only be created thru sp_addtype. This procedure is
modified in SQL2005 to always put newly created types in “dbo” schema. Since types can only
be referenced using single part name in SQL 2000, above algorithm clearly satisfies backward
compatibility requirement.
Note that new type DDL can be used to create types in schemas other than “dbo” and “sys”. It is
recommended to qualify with schema when referencing user defined types in applications, if
active default schema cannot be dependent on.
XML schema collections are newly introduced in SQL2005, so there is no backward
compatibility issue with it.
III. Name Resolution Algorithm for Objects
3.1 Algorithm
In a nutshell, objects name resolution follows the same rule as types and XML schema collections.
However, SQL Server name resolution and execution context used to have special behaviors in
SQL2000 as outlined in Section I, not to mention the fact that system objects are now migrated to
resource database, and user-schema separation feature is introduced. Clearly name resolution
algorithm needs to be more complex, with the following goals (more details later):

Avoiding luring attack

Minimizing user’s name space hijacking

Backward compatibility

Unifying algorithm with types and XML schema collections

Acceptable performance
For DDL access, including CREATE/DROP/ALTER object, the algorithm is the same as types
and XML collections. Following flow chart covers other scenarios for 1-part or 2-part names.
Let
Ss = schema specified
Sa = Active default schema
Start
Y
Y
Ss <> ‘dbo’?
Schema specified?
Look in Ss schema
Look in sys schema, if any of –
1) prefix ‘sp_’;
2) prefix ‘fn_’, ‘xp_’ and master context if dbo schema specified;
3) SQL 2K db-scoped catalog name;
4) SQL 2K server-scoped catalog name and db-context is master.
Y
Y
Found?
Found?
Look in Sa schema
Look in Ss schema
Found?
Found?
Y
Y
Look in dbo schema
Found?
Y
Look in dbo schema in master,
Match only if type is [extended] procedure, view, table.
End
If server name is specified, the object reference is resolved by the specified remote/linked server.
If database name is specified, the object is resolved in the specified database context. (For system
objects which are visible in every database, it means compile/execution database context.) It is
worth to mention the case when schema name is omitted though. Under this circumstance, the
active default schema is defined as following:
 For object references outside SQL modules, or DDL, dynamic SQL inside SQL modules,
current user’s default schema in the specified database is used.
 Otherwise, the module schema owner’s default schema in the specified database is used.
3.2
Avoi
ding
Lurin
g
Attac
k
E.g. suppose user bar owns schema foo in database nwind, when pubs..tab1 is resolved,
default schema of the user in pubs that maps to the same login as bar (or guest) is the
active default schema.
use nwind
go
create procedure foo.proc1
as
select * from pubs..tab1
go
Consider following scenario system admin uses an unqualified name reference to a system
procedure from an arbitrary database/schema context.
use database1
go
exec sp_addlogin 'somelogin', 'somepassword'
go
We recommend users to qualify with schema “sys” when referencing system procedures in
SQL2005. However, existing applications may well rely on SQL 2000 name resolution special
behavior to bind system procedure (with sp_ prefix) tighter than user procedures. If we did not
force a “sys first” strategy, the unqualified name reference shown above would bind to
dbo.sp_addlogin in database1. A “malicious dbo” could then get a system admin to execute
arbitrary code.
3.2 Minimizing User’s Name Space Hijacking
If schema not specified or dbo is specified, look in sys schema, any of –
1) prefix ‘sp_’;
2) prefix ‘fn_’, ‘xp_’ and master context if dbo schema specified;
3) SQL 2K db-scoped catalog name;
4) SQL 2K server-scoped catalog name and db-context is master.
As illustrated in the algorithm
logic, system objects are resolved
first to avoid luring attack.
Unfortunately, it is hijacking
user’s namespace, i.e. if Microsoft
SQL Server ships system stored
procedure sp_foo in SQL2005 or future releases, user stored procedure with same name, which is
contained in dbo schema, can no longer be used.
The strategy is for system objects to use prefix naming standards, and requiring users to qualify
with schema otherwise, such as catalog views, Information-Schema views and dynamic
management views. End users are discouraged to use MS-reserved name prefixes “sp_/fn_/xp_”.
We may introduce SQL-path or similar feature in next release to improve this, as well as address
user’s desire to choose different name resolution search order in general.
3.3 Backward Compatibility
As we can tell from the algorithm, the special behavior of SQL 2000 name resolution with respect
to system objects is preserved. There is subtle difference between SQL2005 and SQL 2000 with
respect to collation used to match names though.
In SQL2005, system object names are matched using collation of current context database. This
means if you are using case sensitive collation database, system object name must be exactly
matched. E.g. “exec SP_help” will fail to resolve as system procedure sp_help in database with
Latin1_General_BIN collation.
The reason is that system objects logically exist in every database, ideally they should be treated
same way as user objects, which are always resolved using context database collation.
This behavior change from SQL 2000 is summarized in the following table. This makes
difference on SQL Server instances where user databases use different collation with server.
(System databases use server collation.)
System object collation difference between SQL 2000 and SQL2005
Schemas
Stored procedures,
functions
Functions
Information-Schema
views
Tables
Tables
SQL 2000
Database collation
Server collation
(shipped in master-db and
contained in schema dbo)
SQL2005
Database collation
Database collation
(shipped in resource-db
contained in schema sys)
Keyword collation, i.e.
Latin1_General_CI_AS_KS_WS
(shipped in master-db and
contained in schema
system_function_schema)
Server collation
(shipped in master-db and
contained in schema
INFORMATION_SCHEMA)
Server collation
(shipped in master-db and
contained in schema dbo)
Database collation
(shipped in resource-db
contained in schema sys)
Database collation
(shipped in all databases and
contained in schema dbo)
Database collation
(shipped in resource-db
contained in schema
INFORMATION_SCHEMA)
Database collation
(changed to back-compatible
views, shipped in resource-db
and contained in schema sys)
Database collation
(changed to back-compatible
views, shipped in resource-db
and contained in schema sys)
IV. Implicit schema (Compile Plan)
A compile plan may refer to non-qualified schema object. It needs to record that fact an
“implicit” schema is used (Name resolution detect such scenario and default schema is recorded
in the compile plan, which is used as part of the plan cache lookup key). Batches with unqualified
object names result in non-reuse of query plans. For example, in "SELECT * FROM MyTable",
MyTable may legitimately resolve to Alice.MyTable if Alice issues this query, and she owns a
table with that name. Similarly, MyTable may resolve to Bob.MyTable. In such cases, SQL
Server does not reuse query plans. If, however, Alice issues "SELECT * FROM dbo.MyTable",
there is no ambiguity because the object is uniquely identified, and query plan reuse can happen.
(See the uid column in sys.syscacheobjects. It indicates the user ID for the connection in which
the plan was generated. Only query plans with the same user ID are candidates for reuse. When
uid = -2, it means that the query does not depend on implicit name resolution, and can be shared
among different user IDs.)
Download