Introduction
The idea of creating machines that learn by themselves has driven humans for decades. Unsupervised learning and clustering are key to fulfilling that dream. Unsupervised learning provides more flexibility, but it is also more challenging.
Clustering plays an important role in drawing insights from unlabeled data. It groups similar data points together, which supports better business decisions by providing a meta-level understanding of the data.
In this skill test, we tested our community on clustering techniques. A total of 1,566 people registered for the test. If you missed it, here is your opportunity to find out how many questions you could have answered correctly.
If you are just getting started with unsupervised learning, here are some comprehensive resources to assist you in your journey:
- Machine Learning Certification Course for Beginners
- The Most Comprehensive Guide to K-Means Clustering You'll Ever Need
- Certified AI & ML Blackbelt+ Program
Overall Results
Below is the distribution of scores; this will help you evaluate your performance:
You can access your performance here. More than 390 people participated in the skill test, and the highest score was 33. Here are a few statistics about the distribution.
Overall distribution
Mean Score: 15.11
Median Score: 15
Mode Score: 16
Helpful Resources
An Introduction to Clustering and different methods of clustering
Getting your clustering right (Part I)
Getting your clustering right (Part II)
Questions & Answers
Q1. Movie recommendation systems are an example of:
- Classification
- Clustering
- Reinforcement Learning
- Regression
Options:
A. 2 Only
B. 1 and 2
C. 1 and 3
D. 2 and 3
E. 1, 2 and 3
F. 1, 2, 3 and 4
Solution: (E)
Generally, movie recommendation systems cluster users into a finite number of similar groups based on their previous activities and profile. Then, at a fundamental level, people in the same cluster are given similar recommendations.
In some scenarios, this can also be approached as a classification problem, assigning the most appropriate movie class to a user of a specific group of users. A movie recommendation system can also be viewed as a reinforcement learning problem, where it learns from its previous recommendations and improves future recommendations.
Q2. Sentiment Analysis is an example of:
- Regression
- Classification
- Clustering
- Reinforcement Learning
Options:
A. 1 Only
B. 1 and 2
C. 1 and 3
D. 1, 2 and 3
E. 1, 2 and 4
F. 1, 2, 3 and 4
Solution: (E)
Sentiment analysis, at the fundamental level, is the task of classifying the sentiments represented in an image, text, or speech into a set of defined sentiment classes like happy, sad, excited, positive, negative, etc. It can also be viewed as a regression problem, assigning a sentiment score of, say, 1 to 10 to a corresponding image, text, or speech.
Another way of looking at sentiment analysis is from a reinforcement learning perspective, where the algorithm constantly learns from the accuracy of past sentiment analysis to improve future performance.
Q3. Can decision trees be used for performing clustering?
A. True
B. False
Solution: (A)
Decision trees can also be used to form clusters in the data, but clustering often generates natural clusters and is not dependent on any objective function.
Q4. Which of the following is the most appropriate strategy for data cleaning before performing clustering analysis, given a less-than-desirable number of data points?
- Capping and flooring of variables
- Removal of outliers
Options:
A. 1 only
B. 2 only
C. 1 and 2
D. None of the above
Solution: (A)
Removal of outliers is not recommended if the data points are few in number. In this scenario, capping and flooring of variables is the most appropriate strategy.
Q5. What is the minimum number of variables/features required to perform clustering?
A. 0
B. 1
C. 2
D. 3
Solution: (B)
At least a single variable is required to perform clustering analysis. Clustering on a single variable can be visualized with the help of a histogram.
Q6. For two runs of K-Means clustering, is it expected to get the same clustering results?
A. Yes
B. No
Solution: (B)
The K-Means clustering algorithm converges to local minima, which may coincide with the global minimum in some cases, but not always. Therefore, it is advised to run the K-Means algorithm multiple times before drawing inferences about the clusters.
However, note that it is possible to get the same clustering results from K-Means by setting the same seed value for each run. But that is done by simply making the algorithm choose the same set of random numbers for each run.
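As a quick illustration (a minimal scikit-learn sketch on made-up data, not part of the original test), fixing the seed makes two runs identical, while a different seed may end up in a different local minimum:

```python
import numpy as np
from sklearn.cluster import KMeans

# Toy data: two loose blobs (hypothetical example data).
rng = np.random.RandomState(0)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(5, 1, (50, 2))])

# Same seed -> identical results across runs.
labels_a = KMeans(n_clusters=2, n_init=1, random_state=42).fit_predict(X)
labels_b = KMeans(n_clusters=2, n_init=1, random_state=42).fit_predict(X)
print((labels_a == labels_b).all())  # True

# A different seed -> the run may converge to a different local minimum.
labels_c = KMeans(n_clusters=2, n_init=1, random_state=7).fit_predict(X)
print((labels_a == labels_c).all())  # not guaranteed to be True
```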
Q7. Is it possible that the assignment of observations to clusters does not change between successive iterations in K-Means?
A. Yes
B. No
C. Can't say
D. None of these
Solution: (A)
When the K-Means algorithm has reached a local or global minimum, it will not change the assignment of data points to clusters across two successive iterations.
Q8. Which of the following can act as possible termination conditions in K-Means?
- For a fixed number of iterations.
- Assignment of observations to clusters does not change between iterations, except for cases with a bad local minimum.
- Centroids do not change between successive iterations.
- Terminate when RSS falls below a threshold.
Options:
A. 1, 3 and 4
B. 1, 2 and 3
C. 1, 2 and 4
D. All of the above
Solution: (D)
All four conditions can be used as possible termination conditions in K-Means clustering:
- This condition limits the runtime of the clustering algorithm, but in some cases the quality of the clustering will be poor because of an insufficient number of iterations.
- Except for cases with a bad local minimum, this produces a good clustering, but runtimes may be unacceptably long.
- This also ensures that the algorithm has converged at a minimum.
- Terminating when RSS falls below a threshold ensures that the clustering is of a desired quality after termination. Practically, it is good to combine it with a bound on the number of iterations to guarantee termination.
Q9. Which of the following clustering algorithms suffers from the problem of convergence at local optima?
- K-Means clustering algorithm
- Agglomerative clustering algorithm
- Expectation-Maximization clustering algorithm
- Diverse clustering algorithm
Options:
A. 1 only
B. 2 and 3
C. 2 and four
D. 1 and 3
E. 1, 2 and 4
F. All of the above
Solution: (D)
Out of the options given, only the K-Means clustering algorithm and the EM clustering algorithm have the drawback of converging at local minima.
Q10. Which of the following algorithms is most sensitive to outliers?
A. K-means clustering algorithm
B. K-medians clustering algorithm
C. K-modes clustering algorithm
D. K-medoids clustering algorithm
Solution: (A)
Out of all the options, the K-Means clustering algorithm is most sensitive to outliers, as it uses the mean of the cluster data points to find the cluster center.
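A small numpy sketch (illustrative numbers, not from the test) of why a mean-based cluster center is pulled by an outlier while a median-based one is barely affected:

```python
import numpy as np

cluster = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
with_outlier = np.append(cluster, 100.0)  # one extreme point added

print(np.mean(cluster), np.mean(with_outlier))      # 3.0 vs ~19.2: the mean shifts a lot
print(np.median(cluster), np.median(with_outlier))  # 3.0 vs 3.5: the median barely moves
```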
Q11. After performing K-Means clustering analysis on a dataset, you observed the following dendrogram. Which of the following conclusions can be drawn from the dendrogram?
A. There were 28 data points in the clustering analysis
B. The best number of clusters for the analyzed data points is 4
C. The proximity function used is Average-link clustering
D. The above dendrogram interpretation is not possible for K-Means clustering analysis
Solution: (D)
A dendrogram is not possible for K-Means clustering analysis. However, one can create a clustergram based on K-Means clustering analysis.
Q12. How can clustering (unsupervised learning) be used to improve the accuracy of a linear regression model (supervised learning)?
- Creating different models for different cluster groups.
- Creating an input feature for cluster IDs as an ordinal variable.
- Creating an input feature for cluster centroids as a continuous variable.
- Creating an input feature for cluster size as a continuous variable.
Options:
A. 1 only
B. 1 and 2
C. 1 and 4
D. 3 only
E. 2 and 4
F. All of the above
Solution: (F)
Creating an input feature for cluster IDs as an ordinal variable or creating an input feature for cluster centroids as a continuous variable might not convey any relevant information to the regression model for multidimensional data. But for clustering in a single dimension, all of the given methods are expected to convey meaningful information to the regression model. For example, to cluster people into two groups based on their hair length, storing the cluster ID as an ordinal variable and the cluster centroid as a continuous variable will convey meaningful information.
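Below is a minimal scikit-learn sketch of options 1 and 2 (hypothetical one-dimensional data, for illustration only): fitting a separate regression per cluster group and adding the cluster ID as an extra input feature.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression

# Hypothetical 1-D feature (e.g. hair length) and a target variable.
rng = np.random.RandomState(0)
x = np.concatenate([rng.normal(5, 1, 50), rng.normal(25, 2, 50)]).reshape(-1, 1)
y = 3 * x.ravel() + rng.normal(0, 1, 100)

cluster_id = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(x)

# Option 1: a separate regression model per cluster group.
models = {c: LinearRegression().fit(x[cluster_id == c], y[cluster_id == c])
          for c in np.unique(cluster_id)}

# Option 2: cluster ID appended as an additional (ordinal) input feature.
x_aug = np.column_stack([x, cluster_id])
combined = LinearRegression().fit(x_aug, y)
print(combined.coef_)
```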
Q13. What could be the possible reason(s) for producing two different dendrograms using an agglomerative clustering algorithm on the same dataset?
A. Proximity function used
B. No. of data points used
C. No. of variables used
D. B and C only
E. All of the above
Solution: (E)
A change in any of the proximity function, the number of data points, or the number of variables will lead to different clustering results and hence different dendrograms.
Q14. In the figure below, if you draw a horizontal line on the y-axis at y = 2, what will be the number of clusters formed?
A. 1
B. 2
C. 3
D. 4
Solution: (B)
Since the number of vertical lines intersecting the red horizontal line at y = 2 in the dendrogram is 2, two clusters will be formed.
Q15. What is the most appropriate number of clusters for the data points represented by the following dendrogram?
A. 2
B. 4
C. 6
D. 8
Solution: (B)
The number of clusters that best depicts the different groups can be chosen by observing the dendrogram. The best choice is the number of vertical lines in the dendrogram cut by a horizontal line that can traverse the maximum distance vertically without intersecting a cluster.
In the above example, the best choice of the number of clusters is 4, as the red horizontal line in the dendrogram below covers the maximum vertical distance AB.
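For Q14 and Q15, the "horizontal cut" rule can be reproduced with SciPy: build the merge tree and cut it at a chosen height. A small sketch on made-up 2-D points (assumed data, not the figures from the test):

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.RandomState(0)
X = np.vstack([rng.normal(c, 0.3, (10, 2)) for c in (0, 4, 8)])  # three loose groups

Z = linkage(X, method='single')                     # agglomerative merge tree
labels = fcluster(Z, t=2.0, criterion='distance')   # cut the dendrogram at height 2
print(len(np.unique(labels)))                       # number of clusters formed by the cut

# scipy.cluster.hierarchy.dendrogram(Z) draws the tree itself (needs matplotlib).
```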
Q16. In which of the following cases will K-Means clustering fail to give good results?
- Data points with outliers
- Data points with different densities
- Data points with round shapes
- Data points with non-convex shapes
Options:
A. 1 and 2
B. 2 and 3
C. 2 and 4
D. 1, 2 and 4
E. 1, 2, 3 and 4
Solution: (D)
The K-Means clustering algorithm fails to give good results when the data contains outliers, when the density spread of data points across the data space varies, and when the data points follow non-convex shapes.
Q17. Which of the following metrics do we have for finding dissimilarity between two clusters in hierarchical clustering?
- Single-link
- Complete-link
- Average-link
Options:
A. 1 and 2
B. 1 and 3
C. 2 and 3
D. 1, 2 and 3
Solution: (D)
All three methods, i.e., single link, complete link, and average link, can be used for finding dissimilarity between two clusters in hierarchical clustering.
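A short SciPy sketch (illustrative random data) showing that all three dissimilarity metrics are available as linkage methods for hierarchical clustering:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage

rng = np.random.RandomState(1)
X = rng.rand(8, 2)  # eight random 2-D points

for method in ('single', 'complete', 'average'):
    Z = linkage(X, method=method)
    # The last row of Z is the final merge; column 2 holds its height (dissimilarity).
    print(method, round(Z[-1, 2], 4))
```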
Q18. Which of the following are true?
- Clustering analysis is negatively affected by multicollinearity of features
- Clustering analysis is negatively affected by heteroscedasticity
Options:
A. 1 only
B. 2 only
C. 1 and 2
D. None of them
Solution: (A)
Clustering analysis is not negatively affected by heteroscedasticity, but the results are negatively impacted by multicollinearity of the features/variables used in clustering, as the correlated feature/variable will carry extra weight in the distance calculation beyond what is desired.
Q19. Given six points with the following attributes:
Which of the following clustering representations and dendrograms depicts the use of the MIN or single-link proximity function in hierarchical clustering?
A.
B.
C.
D.
Solution: (A)
For the single-link or MIN version of hierarchical clustering, the proximity of two clusters is defined to be the minimum of the distances between any two points in the different clusters. For instance, from the table, we see that the distance between points 3 and 6 is 0.11, and that is the height at which they are joined into one cluster in the dendrogram. As another example, the distance between clusters {3, 6} and {2, 5} is given by dist({3, 6}, {2, 5}) = min(dist(3, 2), dist(6, 2), dist(3, 5), dist(6, 5)) = min(0.1483, 0.2540, 0.2843, 0.3921) = 0.1483.
Q20. Given six points with the following attributes:
Which of the following clustering representations and dendrograms depicts the use of the MAX or complete-link proximity function in hierarchical clustering?
A.
B.
C.
D.
Solution: (B)
For the complete-link or MAX version of hierarchical clustering, the proximity of two clusters is defined to be the maximum of the distances between any two points in the different clusters. Similarly, here points 3 and 6 are merged first. However, {3, 6} is merged with {4} instead of {2, 5}. This is because dist({3, 6}, {4}) = max(dist(3, 4), dist(6, 4)) = max(0.1513, 0.2216) = 0.2216, which is smaller than dist({3, 6}, {2, 5}) = max(dist(3, 2), dist(6, 2), dist(3, 5), dist(6, 5)) = max(0.1483, 0.2540, 0.2843, 0.3921) = 0.3921 and dist({3, 6}, {1}) = max(dist(3, 1), dist(6, 1)) = max(0.2218, 0.2347) = 0.2347.
Q21. Given six points with the following attributes:
Which of the following clustering representations and dendrograms depicts the use of the group average proximity function in hierarchical clustering?
A.
B.
C.
D.
Solution: (C)
For the group-average version of hierarchical clustering, the proximity of two clusters is defined to be the average of the pairwise proximities between all pairs of points in the two different clusters. This is an intermediate approach between MIN and MAX. It is expressed by the following equation: proximity(Ci, Cj) = sum of proximity(x, y) over all x in Ci and y in Cj, divided by (mi × mj), where mi and mj are the sizes of clusters Ci and Cj.
Here are the distances between some of the clusters: dist({3, 6, 4}, {1}) = (0.2218 + 0.3688 + 0.2347)/(3 × 1) = 0.2751; dist({2, 5}, {1}) = (0.2357 + 0.3421)/(2 × 1) = 0.2889; dist({3, 6, 4}, {2, 5}) = (0.1483 + 0.2843 + 0.2540 + 0.3921 + 0.2042 + 0.2932)/(3 × 2) = 0.2637. Because dist({3, 6, 4}, {2, 5}) is smaller than dist({3, 6, 4}, {1}) and dist({2, 5}, {1}), these two clusters are merged at the fourth stage.
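The arithmetic can be double-checked in a few lines of Python, using the pairwise distances quoted above (a small illustrative check, not part of the original article):

```python
import numpy as np

# Pairwise distances quoted in the solution above.
d_364_1  = [0.2218, 0.3688, 0.2347]                           # {3,6,4} vs {1}
d_25_1   = [0.2357, 0.3421]                                   # {2,5} vs {1}
d_364_25 = [0.1483, 0.2843, 0.2540, 0.3921, 0.2042, 0.2932]   # {3,6,4} vs {2,5}

print(round(np.mean(d_364_1), 3))   # ~0.275
print(round(np.mean(d_25_1), 3))    # ~0.289
print(round(np.mean(d_364_25), 3))  # ~0.263 -- the smallest, so {3,6,4} and {2,5} merge next
```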
Q22. Given six points with the following attributes:
Which of the following clustering representations and dendrograms depicts the use of Ward's method proximity function in hierarchical clustering?
A.
B.
C.
D.
Solution: (D)
Ward's method is a centroid method. Centroid methods calculate the proximity between two clusters as the distance between the centroids of the clusters. For Ward's method, the proximity between two clusters is defined as the increase in the squared error that results when the two clusters are merged. The figure shows the results of applying Ward's method to the sample data set of six points; the resulting clustering is somewhat different from those produced by MIN, MAX, and group average.
Q23. What should be the best choice of the number of clusters based on the following results?
A. 1
B. 2
C. 3
D. 4
Solution: (C)
The silhouette coefficient is a measure of how similar an object is to its own cluster compared to other clusters. The number of clusters for which the silhouette coefficient is highest represents the best choice of the number of clusters.
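A minimal scikit-learn sketch (made-up blob data) of scoring several values of k by the average silhouette coefficient and keeping the highest:

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

X, _ = make_blobs(n_samples=300, centers=4, random_state=0)

scores = {}
for k in range(2, 8):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = silhouette_score(X, labels)

best_k = max(scores, key=scores.get)
print(best_k, round(scores[best_k], 3))  # k with the highest silhouette coefficient
```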
Q24. Which of the following is/are valid iterative strategies for treating missing values before clustering analysis?
A. Imputation with mean
B. Nearest neighbor assignment
C. Imputation with Expectation Maximization algorithm
D. All of the above
Solution: (C)
All of the mentioned techniques are valid for treating missing values before clustering analysis, but only imputation with the EM algorithm is iterative in its operation.
Q25. The K-Means algorithm has some limitations. One of its limitations is that it makes hard assignments of points to clusters (a point either completely belongs to a cluster or does not belong at all).
Note: Soft assignment can be considered as the probability of being assigned to each cluster: say K = 3 and for some point xn, p1 = 0.7, p2 = 0.2, p3 = 0.1.
Which of the following algorithm(s) allow soft assignments?
- Gaussian mixture models
- Fuzzy K-means
Options:
A. 1 only
B. 2 only
C. 1 and 2
D. None of these
Solution: (C)
Both Gaussian mixture models and Fuzzy K-means allow soft assignments.
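A quick scikit-learn sketch of soft assignment with a Gaussian mixture model (illustrative data): each point gets a probability of belonging to every cluster rather than a single hard label. Fuzzy K-means is not part of scikit-learn; implementations exist in packages such as scikit-fuzzy.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.RandomState(0)
X = np.vstack([rng.normal(0, 1, (50, 2)),
               rng.normal(4, 1, (50, 2)),
               rng.normal(8, 1, (50, 2))])

gmm = GaussianMixture(n_components=3, random_state=0).fit(X)
probs = gmm.predict_proba(X[:1])  # soft assignment for the first point
print(np.round(probs, 2))         # e.g. something close to [[0.98 0.02 0.  ]]
```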
Q26. Assume you want to cluster 7 observations into 3 clusters using the K-Means clustering algorithm. After the first iteration, clusters C1, C2, and C3 have the following observations:
C1: {(2,2), (4,4), (6,6)}
C2: {(0,4), (4,0)}
C3: {(5,5), (9,9)}
What will the cluster centroids be if you want to proceed to the second iteration?
A. C1: (4,4), C2: (2,2), C3: (7,7)
B. C1: (6,6), C2: (4,4), C3: (9,9)
C. C1: (2,2), C2: (0,0), C3: (5,5)
D. None of these
Solution: (A)
Centroid of the data points in cluster C1 = ((2+4+6)/3, (2+4+6)/3) = (4, 4)
Centroid of the data points in cluster C2 = ((0+4)/2, (4+0)/2) = (2, 2)
Centroid of the data points in cluster C3 = ((5+9)/2, (5+9)/2) = (7, 7)
Hence, C1: (4,4), C2: (2,2), C3: (7,7)
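The centroid arithmetic can be verified with numpy (same points as in the question):

```python
import numpy as np

C1 = np.array([(2, 2), (4, 4), (6, 6)])
C2 = np.array([(0, 4), (4, 0)])
C3 = np.array([(5, 5), (9, 9)])

for name, cluster in (("C1", C1), ("C2", C2), ("C3", C3)):
    print(name, cluster.mean(axis=0))  # C1 [4. 4.], C2 [2. 2.], C3 [7. 7.]
```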
Q27. Assume you want to cluster 7 observations into 3 clusters using the K-Means clustering algorithm. After the first iteration, clusters C1, C2, and C3 have the following observations:
C1: {(2,2), (4,4), (6,6)}
C2: {(0,4), (4,0)}
C3: {(5,5), (9,9)}
What will be the Manhattan distance of observation (9, 9) from cluster centroid C1 in the second iteration?
A. 10
B. 5*sqrt(2)
C. 13*sqrt(2)
D. None of these
Solution: (A)
Manhattan distance between centroid C1, i.e., (4, 4), and the point (9, 9) = (9-4) + (9-4) = 10
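The same check for the Manhattan distance, using SciPy's cityblock distance:

```python
from scipy.spatial.distance import cityblock

print(cityblock((4, 4), (9, 9)))  # |9-4| + |9-4| = 10
```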
Q28. If two variables, V1 and V2, are used for clustering, which of the following are true for K-Means clustering with k = 3?
- If V1 and V2 have a correlation of 1, the cluster centroids will be in a straight line
- If V1 and V2 have a correlation of 0, the cluster centroids will be in a straight line
Options:
A. 1 only
B. 2 only
C. 1 and 2
D. None of the above
Solution: (A)
If the correlation between the variables V1 and V2 is 1, then all the data points will lie on a straight line. Hence, all three cluster centroids will form a straight line as well.
Q29. Feature scaling is an important step before applying the K-Means algorithm. What is the reason behind this?
A. In distance calculation it will give the same weight to all features
B. You always get the same clusters whether or not you use feature scaling
C. In Manhattan distance it is an important step, but in Euclidean distance it is not
D. None of these
Solution: (A)
Feature scaling ensures that all the features get the same weight in the clustering analysis. Consider a scenario of clustering people based on their weights (in kg), with range 55-110, and heights (in feet), with range 5.6 to 6.4. In this case, the clusters produced without scaling can be very misleading, as the range of weight is much larger than that of height. Therefore, it is necessary to bring them to the same scale so that they have equal weight in the clustering result.
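A small scikit-learn sketch (with made-up weight/height values in the ranges mentioned above) showing how standardization puts both features on the same scale before K-Means:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.RandomState(0)
weight = rng.uniform(55, 110, 100)   # kg: large numeric range
height = rng.uniform(5.6, 6.4, 100)  # feet: small numeric range
X = np.column_stack([weight, height])

# Without scaling, the weight column dominates the distance computation.
labels_raw = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# StandardScaler gives each feature zero mean and unit variance.
X_scaled = StandardScaler().fit_transform(X)
labels_scaled = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X_scaled)
```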
Q30. Which of the following methods is used for finding the optimal number of clusters in the K-Means algorithm?
A. Elbow method
B. Manhattan method
C. Euclidean method
D. All of the above
E. None of these
Solution: (A)
Out of the given options, only the elbow method is used for finding the optimal number of clusters. The elbow method looks at the percentage of variance explained as a function of the number of clusters: one should choose the number of clusters such that adding another cluster doesn't give much better modeling of the data.
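A minimal elbow-method sketch with scikit-learn (illustrative blob data): compute the within-cluster sum of squares (inertia) for a range of k and look for the bend:

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=500, centers=6, random_state=0)

for k in range(1, 11):
    sse = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X).inertia_
    print(k, round(sse, 1))  # the "elbow" is where the drop in SSE starts to flatten
```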
Q31. What is true about K-Means clustering?
- K-Means is extremely sensitive to cluster center initialization
- Bad initialization can lead to poor convergence speed
- Bad initialization can lead to bad overall clustering
Options:
A. 1 and 3
B. 1 and 2
C. 2 and 3
D. 1, 2 and 3
Solution: (D)
All three of the given statements are true. K-Means is extremely sensitive to cluster center initialization. Also, bad initialization can lead to poor convergence speed as well as bad overall clustering.
Q32. Which of the following can be applied to get good results for the K-Means algorithm, corresponding to the global minimum?
- Try to run the algorithm with different centroid initializations
- Adjust the number of iterations
- Find out the optimal number of clusters
Options:
A. 2 and 3
B. 1 and 3
C. 1 and 2
D. All of the above
Solution: (D)
All of these are standard practices used to obtain good clustering results.
Q33. What should be the best choice for the number of clusters based on the following results?
A. 5
B. 6
C. 14
D. Greater than 14
Solution: (B)
Based on the above results, the best choice for the number of clusters using the elbow method is 6.
Q34. What should be the best choice for the number of clusters based on the following results?
A. 2
B. 4
C. 6
D. 8
Solution: (C)
Generally, a higher average silhouette coefficient indicates better clustering quality. In this plot, the optimal number of clusters should seemingly be 2, at which the average silhouette coefficient is highest. However, the SSE of this clustering solution (k = 2) is too large. At k = 6, the SSE is much lower, and the average silhouette coefficient at k = 6 is also very high, only slightly lower than at k = 2. Thus, the best choice is k = 6.
Q35. Which of the following sequences is correct for a K-Means algorithm using the Forgy method of initialization?
- Specify the number of clusters
- Assign cluster centroids randomly
- Assign each data point to the nearest cluster centroid
- Re-assign each point to the nearest cluster centroid
- Re-compute cluster centroids
Options:
A. 1, 2, 3, 5, 4
B. 1, 3, 2, 4, 5
C. 2, 1, 3, 4, 5
D. None of these
Solution: (A)
The methods used for initialization in K-Means are Forgy and Random Partition. The Forgy method randomly chooses k observations from the data set and uses these as the initial means. The Random Partition method first randomly assigns a cluster to each observation and then proceeds to the update step, thus computing the initial means as the centroids of the clusters' randomly assigned points.
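A small numpy sketch of the two initialization schemes (the helper names are hypothetical, for illustration only):

```python
import numpy as np

def forgy_init(X, k, rng):
    """Forgy: pick k observations at random and use them as the initial means."""
    idx = rng.choice(len(X), size=k, replace=False)
    return X[idx]

def random_partition_init(X, k, rng):
    """Random Partition: randomly assign a cluster to each observation,
    then use each cluster's centroid as its initial mean."""
    labels = rng.randint(0, k, size=len(X))
    return np.array([X[labels == c].mean(axis=0) for c in range(k)])

rng = np.random.RandomState(0)
X = rng.rand(20, 2)
print(forgy_init(X, 3, rng))
print(random_partition_init(X, 3, rng))
```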
Q36. If you are using multinomial mixture models with the expectation-maximization algorithm for clustering a set of data points into two clusters, which of the following assumptions is important?
A. All the data points follow two Gaussian distributions
B. All the data points follow n Gaussian distributions (n > 2)
C. All the data points follow two multinomial distributions
D. All the data points follow n multinomial distributions (n > 2)
Solution: (C)
In the EM algorithm for clustering, it is essential to choose the same number of clusters to classify the data points into as the number of different distributions they are expected to be generated from, and the distributions must also be of the same type.
Q37. Which of the following is/are not true about the centroid-based K-Means clustering algorithm and the distribution-based expectation-maximization clustering algorithm?
- Both start with random initializations
- Both are iterative algorithms
- Both have strong assumptions that the data points must fulfill
- Both are sensitive to outliers
- The expectation-maximization algorithm is a special case of K-Means
- Both require prior knowledge of the number of desired clusters
- The results produced by both are non-reproducible.
Options:
A. 1 only
B. 5 only
C. 1 and 3
D. 6 and 7
E. 4, 6 and 7
F. None of the above
Solution: (B)
All of the above statements are true except the 5th; instead, K-Means is a special case of the EM algorithm in which only the centroids of the cluster distributions are calculated at each iteration.
Q38. Which of the following is/are not true about the DBSCAN clustering algorithm?
- For data points to be in a cluster, they must be within a distance threshold of a core point
- It has strong assumptions about the distribution of data points in the data space
- It has a substantially high time complexity of order O(n³)
- It does not require prior knowledge of the number of desired clusters
- It is robust to outliers
Options:
A. 1 only
B. 2 only
C. 4 only
D. 2 and 3
E. 1 and 5
F. 1, 3 and 5
Solution: (D)
- DBSCAN can form a cluster of any arbitrary shape and does not have strong assumptions about the distribution of data points in the data space.
- DBSCAN has a low time complexity of order O(n log n) only. A short usage sketch follows.
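A short scikit-learn DBSCAN sketch (illustrative data): no number of clusters is specified up front, and points flagged as noise get the label -1.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_moons

X, _ = make_moons(n_samples=300, noise=0.05, random_state=0)  # two non-convex shapes
X = np.vstack([X, [[3.0, 3.0]]])                              # add one far-away outlier

labels = DBSCAN(eps=0.2, min_samples=5).fit_predict(X)
print(np.unique(labels))  # e.g. [-1  0  1]: two clusters plus noise
```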
Q39. Which of the following are the upper and lower bounds for the F-score?
A. [0, 1]
B. (0, 1)
C. [-1, 1]
D. None of the above
Solution: (A)
The lowest and highest possible values of the F-score are 0 and 1, with 1 representing that every data point is assigned to the correct cluster and 0 representing that the precision and/or recall of the clustering analysis is 0. In clustering analysis, a high F-score is desired.
Q40. The following are the results observed for clustering 6,000 data points into 3 clusters: A, B, and C:
What is the F1-score with respect to cluster B?
A. 3
B. 4
C. 5
D. 6
Solution: (D)
Here,
True Positive, TP = 1200
True Negative, TN = 600 + 1600 = 2200
False Positive, FP = 1000 + 200 = 1200
False Negative, FN = 400 + 400 = 800
Therefore,
Precision = TP / (TP + FP) = 0.5
Recall = TP / (TP + FN) = 0.6
Hence,
F1 = 2 * (Precision * Recall) / (Precision + Recall) = 6/11 ≈ 0.545
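The same computation in a few lines (values taken from the counts above):

```python
tp, fp, fn = 1200, 1200, 800

precision = tp / (tp + fp)                          # 0.5
recall = tp / (tp + fn)                             # 0.6
f1 = 2 * precision * recall / (precision + recall)
print(round(precision, 2), round(recall, 2), round(f1, 3))  # 0.5 0.6 0.545
```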
End Notes
I hope you enjoyed taking the test and found the solutions helpful. The test focused on conceptual as well as practical knowledge of clustering fundamentals and its various techniques.
I tried to clear all your doubts through this article, but if we have missed something, let us know in the comments below. Also, if you have any suggestions or improvements you think we should make in the next skill test, you can let us know by dropping your feedback in the comments section.
Learn, compete, hack and get hired!
Source: https://www.analyticsvidhya.com/blog/2017/02/test-data-scientist-clustering/