Overview & Getting started
What is it?
If you work with or are learning Python, sooner or later you will bump into NumPy arrays. I’d venture that NumPy arrays, along with pandas DataFrames, are the workhorses of data as far as Python is concerned. In layman’s terms, NumPy arrays are data containers that can represent multiple dimensions and be queried and operated on, or, if you prefer the official definition from the docs:
NumPy’s main object is the homogeneous multidimensional array. It is a table of elements (usually numbers), all of the same type, indexed by a tuple of non-negative integers. In NumPy dimensions are called axes.
From lists to Numpy Arrays
You might be familiar with Python lists:
my_list = [1, 2, 3, 4, 5, 6]
[1, 2, 3, 4, 5, 6]
Well, a Numpy array at first glance is a very similar concept:
import numpy as np
np_array = np.array([1, 2, 3, 4, 5, 6])
[1 2 3 4 5 6]
Note the lack of commas in the printed output, which tells us we are dealing with something different from a regular list.
SideNote: Why use NumPy? Because element-wise operations on its arrays are generally much faster than looping over a regular list, and because its use is widespread ¯\_(ツ)_/¯
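To make the difference from a list concrete, here is a minimal sketch (the variable names are only illustrative) showing that the same operator behaves differently on each container:

```python
import numpy as np

my_list = [1, 2, 3, 4, 5, 6]
np_array = np.array([1, 2, 3, 4, 5, 6])

# Multiplying a list repeats it; multiplying an array scales every element.
doubled_list = my_list * 2
doubled_array = np_array * 2

print(doubled_list)   # [1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6]
print(doubled_array)  # [ 2  4  6  8 10 12]
```

Element-wise arithmetic like this, done without an explicit Python loop, is where most of the speed advantage tends to come from.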
Creating simple arrays:
Peruse the following for simple Numpy array creation:
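As a minimal sketch, a few of the usual creation helpers look like this. The variable names are illustrative, and this is only a sample of what NumPy offers:

```python
import numpy as np

zeros = np.zeros(3)            # three floats, all 0.0
ones = np.ones(3)              # three floats, all 1.0
counting = np.arange(1, 4)     # integers 1, 2, 3 (the stop value 4 is excluded)
evenly = np.linspace(0, 1, 5)  # 5 evenly spaced floats from 0 to 1, both ends included
filled = np.full(3, 7)         # three copies of the value 7

print(zeros)     # [0. 0. 0.]
print(counting)  # [1 2 3]
```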
Type and type conversion
You might have noticed that some of the numbers above are floats (0.) and others are integers, or ints (1, 2, 3). This raises the question: what types are supported, and how do you change them?
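A short sketch of inspecting and converting types, assuming nothing beyond the dtype attribute, the astype method and the dtype argument of np.array; the variable names are illustrative:

```python
import numpy as np

ints = np.array([1, 2, 3])
floats = np.array([0.0, 1.5, 2.9])

print(ints.dtype.kind)  # i, a signed integer (the exact width depends on the platform)
print(floats.dtype)     # float64

# astype returns a converted copy and leaves the original untouched.
as_floats = ints.astype(np.float64)  # [1. 2. 3.]
as_ints = floats.astype(np.int64)    # [0 1 2], the fractional part is dropped, not rounded

# The type can also be requested when the array is created.
explicit = np.array([1, 2, 3], dtype=np.float32)
```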
You can find all the types and more information here: NumPy data types
Dimensions and Shape
So far we’ve been covering very basic NumPy arrays, but both the utility and the complexity go up once we start using multiple dimensions…
Dimensions make more sense when paired with actual values or use cases. For instance, a list of grocery items can be expressed as a one-dimensional NumPy array:
grocery_list = np.array(['eggs', 'milk', 'cereal', 'bacon'])
print("Dimensions:", grocery_list.ndim)
>>> Dimensions: 1
print("Shape:", grocery_list.shape)
>>> Shape: (4,)
You can check the number of dimensions (or axes) with ndim. shape will give you the size of the array in each dimension; here 4 is the number of elements in our only dimension. The length of shape, in this case 1, is also the number of dimensions.
An X,Y chart consists of two series of values, one per axis, so it makes sense that it can be expressed as a 2 dimensional array:
chartData = X, Y = np.array([[1, 2, 3, 4], [2, 4, 6, 8]])
Dimensions: 2
Shape: (2, 4)
The length of shape is 2, so there are 2 dimensions. The first axis has 2 elements, [1, 2, 3, 4] and [2, 4, 6, 8], and the second axis has 4 elements. Also note that we are assigning the X and Y values upon creation, which is a common way to pass values to a chart...
print(X)
>>> [1 2 3 4]
print(Y)
>>> [2 4 6 8]
A point in space X,Y,Z could be expressed in, you guessed it, 3 dimensions:
point_in_space = X, Y, Z = np.array([[[1]], [[2]], [[3]]])  # 1, 2, 3 are placeholder coordinates
Dimensions: 3
Shape: (3, 1, 1)
If we add a 4th dimension (a comment) to the point we would have:
point_in_spaceWithComment = X, Y, Z, comment = np.array([
    [[[1]]], [[[2]]], [[[3]]],  # 1, 2, 3 are placeholder coordinates
    [[['My Favorite Point']]]
])
Shape: (4, 1, 1, 1)
These are just examples to show you how to write dimensions from scratch and how to figure out how many dimensions you are dealing with. We could, for instance, rewrite the last example as something easier on the eyes and more practical:
points_in_spaceWithComments = np.array([
    [2, 4, 2, 'Comment1'],
    [4, 6, 8, 'Comment2'],
    [1, 4, 3, 'Comment3']
])
Shape: (3, 4)
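One thing worth knowing about this layout: because an array holds a single type, mixing numbers and text makes NumPy store everything as strings. The sketch below checks that and converts the coordinates back to integers; the names coordinates and comments are illustrative, and the comma-based slicing it uses is covered further down.

```python
import numpy as np

points_in_spaceWithComments = np.array([
    [2, 4, 2, 'Comment1'],
    [4, 6, 8, 'Comment2'],
    [1, 4, 3, 'Comment3'],
])

print(points_in_spaceWithComments.shape)       # (3, 4)
print(points_in_spaceWithComments.dtype.kind)  # U, a Unicode string type

# The first three columns are the coordinates, currently stored as strings.
coordinates = points_in_spaceWithComments[:, :3].astype(int)
# The last column holds the comments.
comments = points_in_spaceWithComments[:, 3]
```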
Data I/O and Indexing
By now I hope you have a rough idea of what a NumPy array is. To continue this overview, let’s look at selecting, adding and deleting elements:
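A minimal sketch of the three operations on small arrays, using np.append and np.delete; the variable names are illustrative:

```python
import numpy as np

arr = np.array([10, 20, 30, 40, 50])

# Selecting: indexing and slicing work as they do on lists.
first = arr[0]     # 10
last = arr[-1]     # 50
middle = arr[1:4]  # [20 30 40]

# Adding: np.append returns a new array; axis picks the dimension to extend.
grid = np.array([[1, 2], [3, 4]])
grid_with_row = np.append(grid, [[5, 6]], axis=0)  # shape (3, 2)
arr_longer = np.append(arr, 60)                    # [10 20 30 40 50 60]

# Deleting: np.delete returns a new array without the element at the given index.
arr_shorter = np.delete(arr, 0)  # [20 30 40 50]
```

Unlike list.append, neither np.append nor np.delete changes the original array, so the result has to be assigned to a name.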
If you’ve done any work with lists in Python this should look familiar, although if you look at the second example there is an axis argument, and we haven’t talked about selecting things in a multidimensional NumPy array. For that we’ll need indexing and slicing.
Dealing with multiple dimensions:
Selecting, indexing and slicing in multiple dimensions is surprisingly simple once you know the syntax: basically, add a comma to get to the next dimension. Everything we covered previously still applies:
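Here is a small sketch of the comma syntax on a 2 dimensional array shaped like the chart example; the variable names are illustrative:

```python
import numpy as np

chart_data = np.array([[1, 2, 3, 4],
                       [2, 4, 6, 8]])

row = chart_data[1]         # second row: [2 4 6 8]
single = chart_data[1, 2]   # second row, third column: 6
column = chart_data[:, 0]   # first column of every row: [1 2]
block = chart_data[:, 1:3]  # columns 1 and 2 of every row: [[2 3] [4 6]]
```

The part before the comma selects along the first axis (rows) and the part after it selects along the second axis (columns); a bare colon means "everything along this axis".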
One last NumPy thing you might encounter when starting out is reshaping, which, as its name implies, rearranges the elements of an existing array into a different shape:
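A minimal sketch of reshape, with illustrative variable names. The total number of elements has to stay the same, and -1 asks NumPy to work out one dimension for you:

```python
import numpy as np

flat = np.arange(1, 9)             # [1 2 3 4 5 6 7 8], shape (8,)

two_by_four = flat.reshape(2, 4)   # [[1 2 3 4] [5 6 7 8]]
four_by_two = flat.reshape(4, -1)  # -1 is inferred as 2, giving shape (4, 2)
back_to_flat = two_by_four.reshape(-1)  # one dimension again, shape (8,)

print(two_by_four.shape)  # (2, 4)
print(four_by_two.shape)  # (4, 2)
```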
What else?
Some places I’d steer you if you want to learn more advanced topics:
Meshgrids (complex grid-like arrays)
Broadcasting (advanced array operations)
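As a small taste of both topics, and only a sketch with illustrative names, here is np.meshgrid building coordinate grids and broadcasting producing the same sums directly:

```python
import numpy as np

x = np.array([1, 2, 3])
y = np.array([10, 20])

# meshgrid repeats x along rows and y along columns; both results have shape (2, 3).
xx, yy = np.meshgrid(x, y)
grid_sum = xx + yy  # [[11 12 13] [21 22 23]]

# Broadcasting: a (2, 1) column plus a (3,) row is stretched to shape (2, 3).
broadcast_sum = y[:, np.newaxis] + x  # [[11 12 13] [21 22 23]]
```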
In general, NumPy is another tool in your programmer’s bag. Whenever you encounter complex numerical data (and other types) that you need to store, manipulate and operate on (and do it fast), NumPy is a good candidate.
Hope this helps you get started.
Thanks for reading!