Part 8. Q1.1:
There are three steps to complete this process:
In this step, we research the target cloud in detail, including the
characteristics it supports and does not support. This lets us identify the
essential changes that the application requires.
Once the previous step has identified the differences, the developers
adapt the new database to account for them.
In this step, we isolate database activity, and testers run tests
that examine functionality and performance, concentrating on necessary
criteria such as error handling, logging, security and network issues.
This process is called “Clustering”, a technique used to group similar records
together so that the end user gets a high-level view of what is
happening in the database.
In this process,
we can think of each object as being represented by a feature vector in an
“n”-dimensional space, where “n” is the number of features used to
describe the objects for clustering. The algorithm first chooses “k” points at
random in this vector space; these points serve as the initial cluster centers.
Each object is then assigned to the center it is closest to.
After that, we compute a new center
for every cluster by averaging the feature vectors of all objects that are
assigned to it. The steps of assigning the objects and recomputing the
centers are repeated until the process converges, that is, until the
assignments no longer change. The algorithm can be proven to converge after a
finite number of iterations.
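The procedure described above is commonly known as k-means. The following is a minimal Python sketch of it, not a definitive implementation. The function names (`k_means`, `squared_distance`, `mean_vector`), the sample `points`, and the `seed` and `max_iterations` parameters are all illustrative assumptions. It also assumes that the “k” initial centers are drawn at random from the objects themselves, and that closeness is measured with Euclidean distance.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(vectors):
    """Component-wise average of a non-empty list of feature vectors."""
    count = len(vectors)
    return tuple(sum(component) / count for component in zip(*vectors))


def k_means(objects, k, max_iterations=100, seed=0):
    """Cluster `objects` (tuples of n features) into k groups.

    Returns (centers, labels), where labels[i] is the index of the
    center that objects[i] was assigned to.
    """
    rng = random.Random(seed)
    # Assumption: the k initial centers are drawn from the objects themselves.
    centers = rng.sample(objects, k)
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each object goes to its closest center.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(obj, centers[c]))
            for obj in objects
        ]
        if new_labels == labels:
            break  # no assignment changed, so the process has converged
        labels = new_labels
        # Update step: each center becomes the average of its objects.
        for c in range(k):
            members = [obj for obj, label in zip(objects, labels) if label == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = mean_vector(members)
    return centers, labels


# Hypothetical data: two visibly separate groups of 2-dimensional objects.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centers, labels = k_means(points, k=2)
```

With this sample data the first three objects end up in one cluster and the last three in the other. Which cluster receives index 0 depends on the random initial centers.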
The following are the best-known
data mining parameters:
1. Association: Looking for
patterns where events are connected to one another.
2. Path analysis (Sequence): Looking
for patterns where one event leads to another, later event.
3. Classification: Looking for the patterns
4. Clustering: Discovering and visually documenting
groups of facts that were not previously known.
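As an illustration of the association parameter, the sketch below counts how often pairs of events occur together. The `transactions` data and the `support` and `confidence` helpers are hypothetical. Support and confidence are two measures commonly used for association patterns, chosen here as an assumption rather than taken from the text above.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions: each set lists the events observed together.
transactions = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
]

item_counts = Counter()
pair_counts = Counter()
for events in transactions:
    item_counts.update(events)
    # Sorting gives every pair a single canonical (a, b) key.
    pair_counts.update(combinations(sorted(events), 2))


def support(a, b):
    """Fraction of all transactions that contain both a and b."""
    return pair_counts[tuple(sorted((a, b)))] / len(transactions)


def confidence(a, b):
    """Of the transactions containing a, the fraction that also contain b."""
    return pair_counts[tuple(sorted((a, b)))] / item_counts[a]
```

Here "bread" and "butter" appear together in 3 of the 5 transactions, and 3 of the 4 transactions containing "bread" also contain "butter".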
Examples of non-relational databases:
Graph, Key-value store, Object
store (RDF) database, Tuple store.
they used in the real world?
The most common are MongoDB, Cassandra, DocumentDB, Couchbase, Redis,
HBase and Neo4j.
Why would you
choose to use a non-relational database over a relational database?
Motives for this approach include:
✓ Simple design: there is no need to deal with the “impedance
mismatch” between the row-and-table schema of a relational
database and the object-oriented approach used to write applications.
✓ Finer control over availability: servers can be added or removed
without application downtime.
✓ Better “horizontal” scaling to clusters of machines, which helps
when the number of concurrent users skyrockets for applications that are
accessible via the web and mobile devices.
✓ Easy capture of all types of “Big Data”, including
semi-structured and unstructured data.
✓ A flexible database that can accommodate any
new type of data easily and quickly and is not disrupted by changes in the content.
✓ Cost: NoSQL databases typically run on clusters of cheap commodity servers,
whereas RDBMSs tend to depend on expensive proprietary servers and storage systems.
✓ Speed: The data structures used by NoSQL databases, such as JSON
documents, differ from the data structures used by default in relational
databases. Some operations can therefore be faster in NoSQL than in relational
databases because there is no need to join tables.
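The “impedance mismatch” and join points above can be illustrated with a small Python sketch. It uses the standard `sqlite3` module for the relational side. A plain dictionary holding JSON text stands in for a document store, which is an assumption and not a real NoSQL client. The table names, the `document_store` variable and the sample order are hypothetical. The sketch only shows the difference in data layout and does not measure speed.

```python
import json
import sqlite3

# Relational layout: the order and its items live in separate tables.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT);
    CREATE TABLE order_items (order_id INTEGER, product TEXT, quantity INTEGER);
    INSERT INTO orders VALUES (1, 'Alice');
    INSERT INTO order_items VALUES (1, 'keyboard', 1), (1, 'mouse', 2);
    """
)
# Reading the whole order back requires joining the two tables.
rows = conn.execute(
    """
    SELECT o.customer, i.product, i.quantity
    FROM orders AS o
    JOIN order_items AS i ON i.order_id = o.id
    WHERE o.id = 1
    ORDER BY i.product
    """
).fetchall()

# The application then rebuilds the nested object from the flat rows.
rebuilt = {
    "customer": rows[0][0],
    "items": [{"product": p, "quantity": q} for _, p, q in rows],
}

# Document layout: a plain dict stands in for a document store (assumption).
document_store = {
    1: json.dumps(
        {
            "customer": "Alice",
            "items": [
                {"product": "keyboard", "quantity": 1},
                {"product": "mouse", "quantity": 2},
            ],
        }
    )
}
# One lookup by key returns the object in its application shape, no join.
order = json.loads(document_store[1])
```

Both paths produce the same nested object. The relational path needs a join plus a reshaping step, while the document path reads the object as it was stored.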