Uploaded by Patel Parth

CSUP EXTERNAL NOTES

LAQ
1.Explain and write about the Seven Features of Datastore? What do you
understand about it?
Datastore is a NoSQL document database built for automatic scaling, high performance, and ease of
application development. Datastore features include:
1. Atomic transactions. Datastore can execute a set of operations where either all succeed, or
none occur.
2. High availability of reads and writes. Datastore runs in Google data centers, which use
redundancy to minimize impact from points of failure.
3. Massive scalability with high performance. Datastore uses a distributed architecture to
automatically manage scaling. Datastore uses a mix of indexes and query constraints so your
queries scale with the size of your result set, not the size of your data set.
4. Flexible storage and querying of data. Datastore maps naturally to object-oriented and
scripting languages, and is exposed to applications through multiple clients. It also provides a
SQL-like query language.
5. Balance of strong and eventual consistency. Datastore ensures that entity lookups by key
and ancestor queries always receive strongly consistent data. All other queries are eventually
consistent. The consistency models allow your application to deliver a great user experience
while handling large amounts of data and users.
6. Encryption at rest. Datastore automatically encrypts all data before it is written to disk and
automatically decrypts the data when read by an authorized user. For more information, see
Server-Side Encryption.
7. Fully managed with no planned downtime. Google handles the administration of the
Datastore service so you can focus on your application. Your application can still use Datastore
when the service receives a planned upgrade.
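The all-or-nothing behaviour described in feature 1 can be illustrated with a small in-memory sketch. This is not the Datastore client API: `TinyStore`, `run_transaction`, `debit`, and `credit` are hypothetical names, and the "database" is a plain dictionary. It only shows the idea that a failed operation leaves no partial changes behind.

```python
import copy


class TinyStore:
    """Hypothetical in-memory store, used only to illustrate atomicity."""

    def __init__(self):
        self.data = {}

    def run_transaction(self, operations):
        """Apply every operation, or none of them if any operation fails."""
        staged = copy.deepcopy(self.data)  # work on a private copy
        for op in operations:
            op(staged)                     # an exception here aborts everything
        self.data = staged                 # commit: swap in the staged copy


def debit(account, amount):
    def op(data):
        if data[account] < amount:
            raise ValueError("insufficient funds")
        data[account] -= amount
    return op


def credit(account, amount):
    def op(data):
        data[account] += amount
    return op


store = TinyStore()
store.data = {"alice": 50, "bob": 10}

# Succeeds: both operations are applied together.
store.run_transaction([debit("alice", 20), credit("bob", 20)])

# Fails: the credit is applied to the staged copy, then the debit raises,
# so neither change reaches store.data.
try:
    store.run_transaction([credit("bob", 100), debit("alice", 100)])
except ValueError:
    pass

print(store.data)
```

After the failed second transaction, the balances are still the ones left by the first transaction.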
2.What is Metadata? Categories of Metadata, and Discuss the Role of Metadata?
Metadata is simply defined as data about data.
The data that is used to represent other data is known as metadata. For example, the index of a
book serves as a metadata for the contents in the book. In other words, we can say that metadata is
the summarized data that leads us to detailed data. In terms of data warehouse, we can define
metadata as follows.
● Metadata is the road-map to a data warehouse.
● Metadata in a data warehouse defines the warehouse objects.
● Metadata acts as a directory. This directory helps the decision support system to locate
the contents of a data warehouse.
Categories of Metadata
Metadata can be broadly categorized into three categories −
● Business Metadata − It has the data ownership information, business definition, and
changing policies.
● Technical Metadata − It includes database system names, table and column names and
sizes, data types and allowed values. Technical metadata also includes structural
information such as primary and foreign key attributes and indices.
● Operational Metadata − It includes currency of data and data lineage. Currency of data
means whether the data is active, archived, or purged. Lineage of data means the
history of data migrated and transformation applied on it.
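As a rough illustration, the three categories can be written down as one record per warehouse table. The class names, field names, and sample values below are assumptions made for this sketch and do not come from any particular warehouse product.

```python
from dataclasses import dataclass, field


@dataclass
class BusinessMetadata:
    owner: str
    business_definition: str
    change_policy: str


@dataclass
class TechnicalMetadata:
    table: str
    columns: dict                 # column name -> data type
    primary_key: str
    indexes: list = field(default_factory=list)


@dataclass
class OperationalMetadata:
    currency: str                 # "active", "archived", or "purged"
    lineage: list = field(default_factory=list)  # migration/transformation history


@dataclass
class TableMetadata:
    business: BusinessMetadata
    technical: TechnicalMetadata
    operational: OperationalMetadata


# Hypothetical example entry for a sales table.
sales = TableMetadata(
    business=BusinessMetadata(
        owner="Sales department",
        business_definition="One row per completed customer order",
        change_policy="Schema changes need approval from the data owner",
    ),
    technical=TechnicalMetadata(
        table="sales_orders",
        columns={"order_id": "INTEGER", "amount": "DECIMAL(10,2)"},
        primary_key="order_id",
        indexes=["idx_sales_orders_order_id"],
    ),
    operational=OperationalMetadata(
        currency="active",
        lineage=["extracted from order system", "currency converted to USD"],
    ),
)

print(sales.operational.currency)
```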
Role of Metadata
Metadata has a very important role in a data warehouse. The role of metadata in a warehouse is
different from the warehouse data, yet it plays an important role. The various roles of metadata are
explained below.
● Metadata acts as a directory.
● This directory helps the decision support system to locate the contents of the data
warehouse.
● Metadata helps in summarization between current detailed data and highly
summarized data.
● Metadata also helps in summarization between lightly detailed data and highly
summarized data.
● Metadata is used for query tools.
● Metadata is used in extraction and cleansing tools.
● Metadata is used in reporting tools.
● Metadata is used in transformation tools.
● Metadata plays an important role in loading functions.
3.What is Repository in Metadata (any Five) and Describe the challenges for
metadata management?
Metadata repository is an integral part of a data warehouse system. It has the following metadata −
● Definition of data warehouse − It includes the description of structure of data warehouse.
The description is defined by schema, view, hierarchies, derived data definitions, and
data mart locations and contents.
● Business metadata − It contains the data ownership information, business definition, and
changing policies.
● Operational Metadata − It includes currency of data and data lineage. Currency of data
means whether the data is active, archived, or purged. Lineage of data means the
history of data migrated and transformation applied on it.
● Data for mapping from operational environment to data warehouse − It includes the
source databases and their contents, data extraction, data partition cleaning,
transformation rules, data refresh and purging rules.
● Algorithms for summarization − It includes dimension algorithms, data on granularity,
aggregation, summarizing, etc.
Challenges for Metadata Management
The importance of metadata cannot be overstated. Metadata helps in driving the accuracy of reports,
validates data transformation, and ensures the accuracy of calculations. Metadata also enforces the
definition of business terms to business end-users. With all these uses of metadata, it also has its
challenges. Some of the challenges are discussed below.
● Metadata in a big organization is scattered across the organization. This metadata is
spread in spreadsheets, databases, and applications.
● Metadata could be present in text files or multimedia files. To use this data for
information management solutions, it has to be correctly defined.
● There are no industry-wide accepted standards. Data management solution vendors
have narrow focus.
● There are no easy and accepted methods of passing metadata.
4.Elucidate the terms: a) Firestore in Native mode, b) How to Choose a database mode,
c) Firestore in Datastore mode, d) Pricing and Locations
(A) Firestore in Native mode
Firestore is the next major version of Datastore and a re-branding of the product. Taking the best of
Datastore and the Firebase Realtime Database, Firestore is a NoSQL document database built for
automatic scaling, high performance, and ease of application development.
Firestore introduces new features such as:
● A new, strongly consistent storage layer
● A collection and document data model
● Real-time updates
● Mobile and Web client libraries
Firestore is backwards compatible with Datastore, but the new data model, real-time updates, and
mobile and web client library features are not. To access all of the new Firestore features, you must
use Firestore in Native mode.
(B) Choosing a database mode
When you create a new Firestore database, you must select a database mode. You cannot use both
Native mode and Datastore mode in the same project. We recommend the following when choosing a
database mode:
● Use Firestore in Datastore mode for new server projects.
Firestore in Datastore mode allows you to use established Datastore server architectures while
removing fundamental Datastore limitations. Datastore mode can automatically scale to
millions of writes per second.
● Use Firestore in Native mode for new mobile and web apps.
Firestore offers mobile and web client libraries with real-time and offline features. Native mode
can automatically scale to millions of concurrent clients.
(C) Firestore in Datastore mode
Firestore in Datastore mode uses Datastore system behavior but accesses Firestore's storage layer,
removing the following Datastore limitations:
● Eventual consistency: Datastore queries become strongly consistent unless you explicitly
request eventual consistency.
● Queries in transactions are no longer required to be ancestor queries.
● Transactions are no longer limited to 25 entity groups.
● Writes to an entity group are no longer limited to 1 per second.
Datastore mode disables Firestore features that are not compatible with Datastore:
● The project will accept Datastore API requests and deny Firestore API requests.
● The project will use Datastore indexes instead of Firestore indexes.
● You can use Datastore client libraries with this project but not Firestore client libraries.
● In the Google Cloud console, the database will use the Datastore viewer.
(D) Pricing and Locations
Pricing depends on the following:
● The number of documents you read, write, and delete.
● The number of index entries matched by aggregation queries. You are charged one document
read for each batch of up to 1000 index entries matched by the query.
● The amount of storage that your database uses, including overhead for metadata and indexes.
● The amount of network bandwidth that you use.
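The aggregation-query rule in the list (one document read per batch of up to 1000 index entries) is a ceiling division. The function name below is hypothetical, and the sketch deliberately rejects the zero-match case because these notes do not say how such a query is billed.

```python
def aggregation_read_charge(index_entries_matched: int) -> int:
    """Document reads billed for one aggregation query, per the rule above."""
    if index_entries_matched < 1:
        # Assumption: the zero-match case is not described in these notes,
        # so this sketch does not try to model it.
        raise ValueError("this sketch only covers queries matching at least one entry")
    # Ceiling division: every started batch of 1000 entries costs one read.
    return (index_entries_matched + 999) // 1000


for matched in (1, 1000, 1001, 2500):
    print(matched, "entries ->", aggregation_read_charge(matched), "read(s)")
```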
Location: Several services available for your app require a location setting, called your project's default Google
Cloud Platform (GCP) resource location. This location is where your data is stored for GCP services
that require a location setting.
There are two main types of location:
1. Multi-region locations
2. Regional locations
5. Explain the App Engine Architecture with Flow Chart Diagram?
An App Engine application is created under a Google Cloud Platform project when an application resource is created.
The application in GAE is a top-level container that includes the service, version, and
instance resources that make up the app. When you create an App Engine application, all of your
resources are created in the region you choose, including the app code and a collection of settings,
credentials, and your app's metadata.
Each GAE application includes at least one service, the default service, which can hold many
versions, depending on your app's billing status.
The following diagram shows the hierarchy of a GAE application running with two services. In this
diagram, the app has 2 services that contain different versions, and two of those versions are actively
running on different instances:
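The same hierarchy can also be written as a nested structure, which is an easy way to remember the levels when drawing the chart: application, then services, then versions, then instances. All names below (`example-app`, `api`, `v1`, `instance-1`, and so on) are made up for this sketch.

```python
# Application -> services -> versions -> instances (all names are hypothetical).
app = {
    "application": "example-app",
    "services": {
        "default": {
            "versions": {
                "v1": {"instances": []},                          # deployed, not running
                "v2": {"instances": ["instance-1", "instance-2"]},
            }
        },
        "api": {
            "versions": {
                "v1": {"instances": ["instance-3"]},
                "v2": {"instances": []},
            }
        },
    },
}


def running_versions(app):
    """Return (service, version) pairs that have at least one instance."""
    return [
        (service_name, version_name)
        for service_name, service in app["services"].items()
        for version_name, version in service["versions"].items()
        if version["instances"]
    ]


def render_tree(app):
    """Return the hierarchy as indented text, one level per indent step."""
    lines = [f"Application: {app['application']}"]
    for service_name, service in app["services"].items():
        lines.append(f"  Service: {service_name}")
        for version_name, version in service["versions"].items():
            lines.append(f"    Version: {version_name}")
            for instance in version["instances"]:
                lines.append(f"      Instance: {instance}")
    return "\n".join(lines)


print(render_tree(app))
print(running_versions(app))
```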
Services
Services in GAE are used to split large apps into logical components that can securely share
the features of App Engine and communicate with each other. Generally, App Engine services
behave like microservices. Therefore, we can run our app in a single service, or we can deploy
multiple services to run as a set of microservices.
Example: An app that handles customer requests may include distinct services, each handling a different
task, such as:
● Internal or administration-type requests
● Backend processing (billing pipelines and data analysis)
● API requests from mobile devices
Each service in GAE consists of the source code from our app and the corresponding App Engine
configuration files. The set of files that we deploy to a service represents a single version of that
service, and each time we deploy a set of files to that service, we create an additional
version within that same service.
Versions
Having different versions of the app within each service allows us to quickly switch between different
versions of that app for rollbacks, testing, or other temporary events. We can route traffic to specific or
different versions of our app by migrating or splitting traffic.
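Traffic splitting can be pictured as assigning each user to a version according to fixed percentages. The sketch below is only an illustration of that idea and is not App Engine's actual routing algorithm; `pick_version` and the 90/10 split are assumptions. Hashing the user ID keeps each user on the same version across requests.

```python
import hashlib


def pick_version(user_id: str, split: dict) -> str:
    """Choose a version for a user; `split` maps version -> percent of traffic."""
    if sum(split.values()) != 100:
        raise ValueError("percentages must add up to 100")
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100   # stable value in 0..99
    upper = 0
    for version, percent in split.items():
        upper += percent
        if bucket < upper:
            return version
    raise AssertionError("unreachable when percentages add up to 100")


# Hypothetical canary rollout: 90% of users stay on v1, 10% try v2.
split = {"v1": 90, "v2": 10}
assignments = [pick_version(f"user-{i}", split) for i in range(1000)]
print("v1:", assignments.count("v1"), "v2:", assignments.count("v2"))
```

Migrating traffic is then the special case where one version receives 100 percent.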
Instances
The versions within our services run on one or more instances. By default, App Engine scales our
app to match the load. GAE will scale up the number of running instances
to provide consistent performance, or scale down to minimize idle instances and reduce overall
costs.
6. What are “Uploads” and “Downloads” when Deploying an Application in
Google Cloud?
Uploads
You can send upload requests to Cloud Storage in the following ways:
● Single-request upload. An upload method where an object is uploaded as a single request.
Use this if the file is small enough to upload again in its entirety if the connection fails. See the
Cloud Storage guides "Upload object from file" and "Upload object from memory" for single-request uploads.
● Upload object from memory. An upload method where an object is uploaded from memory
instead of a filesystem.
● Resumable upload. An upload method that provides a more reliable transfer, which is
especially important with large files. Resumable uploads are a good choice for most
applications, since they also work for small files at the cost of one additional HTTP request per
upload. You can also use resumable uploads to perform streaming transfers, which allows you
to upload an object of unknown size.
● XML API multipart upload. An upload method that is compatible with Amazon S3 multipart
uploads. Files are uploaded in parts and assembled into a single object with the final request.
XML API multipart uploads allow you to upload the parts in parallel, potentially reducing the
time to complete the overall upload.
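The "parts, then assemble" idea behind a multipart upload can be sketched locally without any network calls. This is not the Cloud Storage XML API: `split_into_parts` and `assemble` are hypothetical helpers, and the part size is arbitrary. Parts are keyed by part number so they can arrive in any order, as they would when uploaded in parallel.

```python
def split_into_parts(data: bytes, part_size: int) -> dict:
    """Cut `data` into numbered parts (part numbers start at 1)."""
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    return {
        number: data[offset:offset + part_size]
        for number, offset in enumerate(range(0, len(data), part_size), start=1)
    }


def assemble(parts: dict) -> bytes:
    """Join parts in part-number order, whatever order they arrived in."""
    return b"".join(parts[number] for number in sorted(parts))


payload = bytes(range(256)) * 10            # 2560 bytes of sample data
parts = split_into_parts(payload, 1000)     # parts of 1000, 1000, and 560 bytes

# Simulate parts finishing out of order during a parallel upload.
arrived = {3: parts[3], 1: parts[1], 2: parts[2]}
print(len(parts), "parts;", "intact" if assemble(arrived) == payload else "corrupted")
```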
Downloads
All downloads from Cloud Storage have the same basic behavior: an HTTP or HTTPS GET request
that can include an optional Range header, which defines a specific portion of the object to download.
Using this basic download behavior, you can resume interrupted downloads, and you can utilize more
advanced download strategies, such as sliced object downloads and streaming downloads.
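The Range header mentioned above uses inclusive byte offsets, in the form `bytes=start-end` or `bytes=start-` for "from here to the end". The helper names below are hypothetical; the sketch only builds the headers and does not send any request.

```python
def range_header(start: int, end=None) -> dict:
    """Build a Range header; `end` is inclusive, None means 'to the end'."""
    if start < 0 or (end is not None and end < start):
        raise ValueError("invalid byte range")
    value = f"bytes={start}-" if end is None else f"bytes={start}-{end}"
    return {"Range": value}


def resume_header(bytes_already_received: int) -> dict:
    """Header for resuming an interrupted download where it stopped."""
    return range_header(bytes_already_received)


def slice_headers(object_size: int, slices: int) -> list:
    """Headers for a sliced download: contiguous ranges covering the object."""
    base, extra = divmod(object_size, slices)
    headers, start = [], 0
    for i in range(slices):
        length = base + (1 if i < extra else 0)
        if length == 0:
            continue
        headers.append(range_header(start, start + length - 1))
        start += length
    return headers


print(resume_header(1024))
print(slice_headers(10, 3))
```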
SAQ
1. What are Similarities between First and Second Generation Runtime?
● Nearly instantaneous scale-up time to respond to traffic spikes
● Applications are built using the same build process
● Same SLA for GA services
● Identical gcloud command support and the same GCP console interface
● Free tier
2. Explain about any Four Features of Datastore?
Write answer from above LAQ Q.1 (any four features)
3.What are Datastore Internals?
Write answer from above LAQ Q.1 (Just write the name of features)
4.Explain the terms “Documents” in User Data.
➔ A document is a form of information that might be useful to a user or set of users. This information
can be in digital and nondigital forms. Accordingly, a document can be either digital or nondigital.
Different methods are used to store digital and nondigital documents.
➔ A nondigital or paper document can be physically stored in a file cabinet, whereas an electronic or
digital document is stored in a computer as one or more files. Digital documents can also be part of a
database. Electronic document management programs deal with the management, storage and
security of electronic documents.
5. What is Memcached?
➔ Memcached is a free and open-source high-performance memory caching system. It’s typically
used to cache database data, API calls or page rendering chunks in RAM to increase the application
performance.
➔ It can store data as small as a number, or as big as a finished HTML page.
➔ The system is designed to be accessed through TCP so it can work on a separate server, and can
also be distributed among several servers, which together form one big hash table to store data.
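The two ideas above (cache in RAM first, and spread keys over several servers like one big hash table) can be sketched with plain dictionaries standing in for cache servers. This does not use a real Memcached client: `TinyCacheCluster`, `slow_database_lookup`, and the server names are hypothetical, and the simple modulo placement is an assumption (real clients often use consistent hashing instead).

```python
import hashlib


class TinyCacheCluster:
    """Dictionaries standing in for several cache servers."""

    def __init__(self, server_names):
        self.names = list(server_names)
        self.servers = {name: {} for name in self.names}

    def _server_for(self, key):
        # Hash the key to pick one server, so the cluster acts as one table.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        index = int.from_bytes(digest[:4], "big") % len(self.names)
        return self.servers[self.names[index]]

    def get(self, key):
        return self._server_for(key).get(key)

    def set(self, key, value):
        self._server_for(key)[key] = value


db_calls = []


def slow_database_lookup(user_id):
    """Stand-in for an expensive database query."""
    db_calls.append(user_id)
    return {"id": user_id, "name": f"user-{user_id}"}


def get_user(cache, user_id):
    """Cache-aside: try the cache, fall back to the database, then store."""
    key = f"user:{user_id}"
    value = cache.get(key)
    if value is None:
        value = slow_database_lookup(user_id)
        cache.set(key, value)
    return value


cache = TinyCacheCluster(["cache-a", "cache-b", "cache-c"])
first = get_user(cache, 7)    # cache miss: goes to the database
second = get_user(cache, 7)   # cache hit: served from memory
print(first == second, "database calls:", len(db_calls))
```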
6. Define “FaaS” and “IaaS”.
FAAS
➔ FaaS stands for function as a service.
➔ Function as a service (FaaS) is a cloud computing model that enables cloud customers to develop
applications and deploy functionalities, and to be charged only when the functionality executes. FaaS is
often used to deploy microservices.
➔ It is also referred to as serverless computing.
IAAS
➔ Infrastructure as a service (IaaS) is a form of cloud computing that provides virtualized computing
resources over the internet.
➔ In the IaaS model, the cloud provider manages IT infrastructure
such as storage, server and networking resources, and delivers them to subscriber organizations via
virtual machines accessible through the internet. IaaS can have many benefits for organizations, such
as potentially making workloads faster, easier, more flexible and more cost efficient.
7. What do you Understand by “PaaS” and “CaaS”?
PAAS
➔ Platform as a service (PaaS) is a cloud computing model where a third-party provider delivers
hardware and software tools to users over the internet.
➔ Usually, these tools are needed for application development. A PaaS provider hosts the hardware
and software on its own infrastructure. As a result, PaaS frees developers from having to install
in-house hardware and software to develop or run a new application.
CAAS
➔ Containers as a service (CaaS) is a cloud service that allows software developers and IT
departments to upload, organize, run, scale, manage and stop containers by using container-based
virtualization.
➔ A CaaS provider will commonly provide a framework which allows users to make use of the service.
Providers typically make use of application programming interface (API) calls or a web portal
interface.
8. Explain the “Firestore in Datastore mode” and short Description on fully
managed with no planned downtime.
WRITE ANSWER FROM ABOVE LAQ Q.4
9.Make a Short Note on “Benefits of Data Stores”.
WRITE DOWN FROM ABOVE LAQ Q.1 (write in statements not in listed features)
10.What are Tokenizing Rules?
➔ Tokenization replaces a sensitive data element, for example, a bank account number,
with a non-sensitive substitute, known as a token. The token is a randomized data
string that has no essential or exploitable value or meaning. It is a unique identifier
which retains all the pertinent information about the data without compromising its
security.
➔ A tokenization system links the original data to a token but does not provide any way
to decipher the token and reveal the original data. This is in contrast to encryption
systems, which allow data to be deciphered using a secret key.
Payment Tokenization Example
When a merchant processes the credit card of a customer, the PAN is substituted with a token.
1234-4321-8765-5678 is replaced with, for example, 6f7%gf38hfUa.
The merchant can apply the token ID to retain records of the customer, for example, 6f7%gf38hfUa is
connected to John Smith. The token is then transferred to the payment processor who de-tokenizes
the ID and confirms the payment. 6f7%gf38hfUa becomes 1234-4321-8765-5678.
The payment processor is the only party who can read the token; it is meaningless to anyone else.
Furthermore, the token is useful only with that single merchant.
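The payment example can be sketched as a small token vault. `TokenVault` is a hypothetical class that stands in for the payment processor's secure store; the point it illustrates is that the token is random, so the original value can only be recovered by looking it up in the vault, not by computing it from the token.

```python
import secrets


class TokenVault:
    """Hypothetical vault that maps random tokens to the original values."""

    def __init__(self):
        self._token_to_value = {}
        self._value_to_token = {}

    def tokenize(self, value: str) -> str:
        """Return the token for `value`, creating a random one if needed."""
        if value in self._value_to_token:
            return self._value_to_token[value]
        token = secrets.token_urlsafe(9)          # 12 random URL-safe characters
        while token in self._token_to_value:      # extremely unlikely collision
            token = secrets.token_urlsafe(9)
        self._token_to_value[token] = value
        self._value_to_token[value] = token
        return token

    def detokenize(self, token: str) -> str:
        """Only the vault holder can map a token back to the original value."""
        return self._token_to_value[token]


vault = TokenVault()                    # held by the payment processor
pan = "1234-4321-8765-5678"             # sample card number from the notes
token = vault.tokenize(pan)             # this is all the merchant stores

print("merchant stores:", token)
print("processor recovers:", vault.detokenize(token))
```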
11. Draw Flow Chart of App Engine Hierarchy and Label it?
Draw from above LAQ Q.5
12. What is Firestore in Native Mode?
WRITE ANSWER FROM ABOVE LAQ Q.4
13.Explain the Term “Data Warehousing”?
➔ A data warehouse is a central repository of information that can be analyzed to make more informed
decisions. Data flows into a data warehouse from transactional systems, relational databases, and
other sources, typically on a regular cadence.
➔ Business analysts, data engineers, data scientists, and decision makers access the data through
business intelligence (BI) tools, SQL clients, and other analytics applications.
➔ A data warehouse is specially designed for data analytics, which involves reading large amounts of
data to understand relationships and trends across the data. A database is used to capture and store
data, such as recording details of a transaction.
➔ Unlike a data warehouse, a data lake is a centralized repository for all data, including structured,
semi-structured, and unstructured. A data warehouse requires that the data be organized in a tabular
format, which is where the schema comes into play.
14.What is Multi Tenancy?
➔ Multi-tenancy is an architecture in which a single instance of a software application serves multiple
customers. Each customer is called a tenant.
➔ Tenants can be given the ability to customize some parts of the application, such as the color of the
user interface or business rules, but they can't customize the application's code.
➔ In a multi-tenant architecture, multiple instances of an application operate in a shared environment.
This architecture is able to work because each tenant is integrated physically but is logically
separated. This means that a single instance of the software will run on one server and then serve
multiple tenants. In this way, a software application in a multi-tenant architecture can share a
dedicated instance of configurations, data, user management and other properties.
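A minimal sketch of logical separation inside one shared instance follows. `NotesApp`, the tenant names, and the `ui_color` setting are hypothetical; the sketch shows one application object serving several tenants, each with its own data and its own customizable setting, while all of them run the same code.

```python
class NotesApp:
    """One application instance shared by every tenant."""

    def __init__(self):
        self._notes = {}       # tenant_id -> that tenant's notes only
        self._settings = {}    # tenant_id -> per-tenant customization

    def register_tenant(self, tenant_id, ui_color="blue"):
        self._notes[tenant_id] = []
        self._settings[tenant_id] = {"ui_color": ui_color}

    def add_note(self, tenant_id, text):
        self._notes[tenant_id].append(text)

    def list_notes(self, tenant_id):
        # Every read is scoped by tenant_id, which keeps tenants separated.
        return list(self._notes[tenant_id])

    def ui_color(self, tenant_id):
        return self._settings[tenant_id]["ui_color"]


app_instance = NotesApp()                        # the single shared instance
app_instance.register_tenant("acme", ui_color="green")
app_instance.register_tenant("globex")           # keeps the default color

app_instance.add_note("acme", "Quarterly report due Friday")
app_instance.add_note("globex", "Renew the support contract")

print(app_instance.list_notes("acme"), app_instance.ui_color("acme"))
print(app_instance.list_notes("globex"), app_instance.ui_color("globex"))
```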
15.Explain the terms, a) BLOBS, b) USERS.
BLOBS
➔ BLOB stands for “Binary Large Object”.
➔ BLOB is a data type that stores binary data. Binary Large Objects (BLOBs) can be complex files like
images or videos, unlike other data strings that only store letters and numbers.
➔ A BLOB will hold multimedia objects to add to a database; however, not all databases support BLOB
storage.
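As a small illustration, SQLite is one database that does support a BLOB column type, and Python's built-in `sqlite3` module stores `bytes` values as BLOBs. The table name, file name, and sample bytes below are made up for this sketch; the bytes are arbitrary binary data, not a valid image.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE media (name TEXT PRIMARY KEY, content BLOB)")

# Arbitrary binary payload standing in for an image or video file.
image_bytes = bytes([0x89, 0x50, 0x4E, 0x47, 0x00, 0xFF, 0x10])
conn.execute(
    "INSERT INTO media (name, content) VALUES (?, ?)",
    ("logo.png", image_bytes),
)

# Read the value back together with the storage type SQLite used for it.
stored, column_type = conn.execute(
    "SELECT content, typeof(content) FROM media WHERE name = ?",
    ("logo.png",),
).fetchone()
conn.close()

print(column_type, stored == image_bytes)
```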
USERS
➔ User means any individual person making use of a CSP's Cloud Services provided to a Cloud
Customer, based on a relationship between that Cloud Customer and the Cloud User.
➔ In general, a user is someone who employs or uses a particular thing, such as a user of an
internet site. To use something is to employ it or operate it, so a user is someone who uses or
takes advantage of something.