Data Visualization in Python With matplotlib, Seaborn and Bokeh

Photo by Mehreen Saeed, some rights reserved.

This tutorial is divided into 7 parts, including:

Preparation of scatter data
Line plots in matplotlib, Seaborn, and Bokeh
More on visualization

In this post, we will use matplotlib, Seaborn, and Bokeh. They are all external libraries that need to be installed. To install them using pip, run the following command:

    pip install matplotlib seaborn bokeh

For demonstration purposes, we will also use the MNIST handwritten digits dataset. We will load it from TensorFlow and run the PCA algorithm on it. Hence we will also need to install TensorFlow and pandas:

    pip install tensorflow pandas

The code afterwards will assume the following imports are executed:

    import numpy as np
    import matplotlib.pyplot as plt
    from tensorflow.keras.datasets import mnist
    from tensorflow.keras import Sequential
    from tensorflow.keras.layers import Dense, Reshape
    from tensorflow import convert_to_tensor, dtypes, linalg, tensordot, transpose
    from bokeh.models import Legend, LegendItem

We load the MNIST dataset from the keras.datasets library. To keep things simple, we’ll retain only the subset of data containing the first three digits.

    (x_train, train_labels), (_, _) = mnist.load_data()
    # Keep only the digits 0, 1, and 2
    total_classes = 3
    ind = np.where(train_labels < total_classes)
    x_train, train_labels = x_train[ind], train_labels[ind]
    # Check the shape of the training data
    total_examples, img_length, img_width = x_train.shape
    print('Training data has ', total_examples, 'images')
    print('Each image is of size ', img_length, 'x', img_width)

The last line of the output is:

    Each image is of size 28 x 28

Figures in matplotlib

Seaborn is indeed an add-on to matplotlib. Therefore you need to understand how matplotlib handles plots even if you’re using Seaborn. You can divide the figure into several sections called subplots, so you can put two visualizations side-by-side.

As an example, let’s visualize the first 16 images of our MNIST dataset using matplotlib. We’ll create 2 rows and 8 columns using the subplots() function. The subplots() function will create the axes objects for each unit. Then we will display each image on each axes object using the imshow() method. Finally, the figure will be shown using the show() function.

    img_per_row = 8
    # Hide the ticks on every subplot by passing empty lists
    fig, ax = plt.subplots(nrows=2, ncols=img_per_row,
                           subplot_kw=dict(xticks=[], yticks=[]))
    for row in [0, 1]:
        for col in range(img_per_row):
            ax[row, col].imshow(x_train[row*img_per_row + col].astype('int'))
    plt.show()

First 16 images of the training dataset displayed in 2 rows and 8 columns

Here we can see a few properties of matplotlib. There is a default figure and default axes in matplotlib. There are a number of functions defined in matplotlib under the pyplot submodule for plotting on the default axes. If we want to plot on a particular axes, we can use the plotting function under the axes objects. The operations to manipulate a figure are procedural. That is, matplotlib remembers a data structure internally, and our operations mutate it. The show() function simply displays the result of a series of operations. Because of that, we can gradually fine-tune a lot of details on the figure. In the example above, we hid the “ticks” (i.e., the markers on axes) by setting xticks and yticks to empty lists.

One of the common visualizations we use in machine learning projects is the scatter plot.

As an example, we apply PCA to the MNIST dataset and extract the first three components of each image. In the code below, we compute the eigenvectors and eigenvalues from the dataset, then project the data of each image along the direction of the eigenvectors, and store the result in x_pca. For simplicity, we didn’t normalize the data to zero mean and unit variance before computing the eigenvectors. This omission does not affect our purpose of visualization.

    # Convert the dataset into a 2D array of shape 18623 x 784
    x = convert_to_tensor(np.reshape(x_train, (x_train.shape[0], -1)),
                          dtype=dtypes.float32)
    # Eigen-decomposition from a 784 x 784 matrix
    eigenvalues, eigenvectors = linalg.eigh(tensordot(transpose(x), x, axes=1))
    # Print the three largest eigenvalues
    print('3 largest eigenvalues: ', eigenvalues[-3:])
    # Project the data onto the eigenvectors
    x_pca = tensordot(x, eigenvectors, axes=1)

The output is a float32 tensor of shape (3,) holding the three largest eigenvalues.

Let’s consider the last two columns as the x- and y-coordinates and plot a point for each row. (linalg.eigh() returns the eigenvalues in ascending order, so the last columns correspond to the largest eigenvalues.) We can further color each point according to which digit it corresponds to.

The following code generates a scatter plot using matplotlib. The plot is created using the axes object’s scatter() function, which takes the x- and y-coordinates as the first two arguments. The c argument to the scatter() method specifies the values that determine each point’s color. The code also creates a legend and adds a title to the plot.

    fig, ax = plt.subplots()
    scatter = ax.scatter(x_pca[:, -1], x_pca[:, -2], c=train_labels, s=5)
    legend_plt = ax.legend(*scatter.legend_elements())
    plt.title('First Two Dimensions of Projected Data After Applying PCA')
    plt.show()

2D scatter plot generated using matplotlib

Putting the above together gives the complete code to generate the 2D scatter plot using matplotlib.
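To try the projection and scatter-plot steps described above without downloading MNIST, here is a minimal sketch that uses NumPy in place of TensorFlow. The synthetic three-cluster data, the variable names, and the output filename are assumptions made for illustration only; they are not part of the tutorial's code.

```python
import matplotlib.pyplot as plt
import numpy as np

# Synthetic stand-in for the image data (an assumption, not MNIST):
# three clusters of 200 points each in 10 dimensions, labeled 0, 1, 2.
rng = np.random.default_rng(0)
n_per_class, n_features = 200, 10
centers = rng.normal(scale=5.0, size=(3, n_features))
x = np.vstack([c + rng.normal(size=(n_per_class, n_features)) for c in centers])
labels = np.repeat(np.arange(3), n_per_class)

# Eigen-decomposition of the uncentered 10 x 10 matrix x^T x, mirroring the text.
# np.linalg.eigh returns eigenvalues in ascending order, so the last columns
# of `eigenvectors` belong to the largest eigenvalues.
eigenvalues, eigenvectors = np.linalg.eigh(x.T @ x)
x_pca = x @ eigenvectors

# Scatter plot of the two leading components, colored by label.
fig, ax = plt.subplots()
scatter = ax.scatter(x_pca[:, -1], x_pca[:, -2], c=labels, s=5)
ax.legend(*scatter.legend_elements(), title="Class")
ax.set_title("Two leading components of synthetic data")
fig.savefig("pca_scatter_sketch.png")  # hypothetical output filename
```

In an interactive session, calling plt.show() instead of savefig() displays the figure directly. Because the data is not centered, the leading direction mostly reflects the overall mean of the data, which is the same simplification the tutorial makes.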