Let’s Not Forget NumPy !. Refer this blog if you are a NumPy… | by Puneet Gajwal | Nov, 2020

Coming to slicing now, an important point to discuss first is:

Array slices are views of the original array, which means that the data is not copied and any modification to a view also modifies the original array.

#Out[1] array([0, 1, 25, 25, 25, 5, 7])

If we observe the code above, modifying arr_slice also modifies our original arr. To avoid this, we need to copy the array explicitly to tell NumPy not to create a view. We can simply write “arr_slice=arr[2:5].copy()”.
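To make the view-versus-copy behaviour concrete, here is a minimal sketch. The array contents and the variable names (view, safe) are assumptions chosen for illustration, not the values used in the snippet above.

```python
import numpy as np

arr = np.arange(8)             # hypothetical example array: [0 1 2 3 4 5 6 7]

view = arr[2:5]                # a basic slice is a view on arr's data
view[:] = 25                   # the write goes through to arr
print(arr)                     # [ 0  1 25 25 25  5  6  7]

safe = arr[5:8].copy()         # an explicit copy owns its own data
safe[:] = -1                   # arr is left untouched
print(arr)                     # still [ 0  1 25 25 25  5  6  7]

# np.shares_memory reports whether two arrays overlap in memory
shares_view = np.shares_memory(arr, view)   # True
shares_copy = np.shares_memory(arr, safe)   # False
print(shares_view, shares_copy)
```

If you are ever unsure whether you are holding a view or a copy, np.shares_memory is a quick way to check.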

One might wonder why NumPy is designed this way. The simple answer is that NumPy is built to handle very large amounts of data, and if it kept making copies instead of views, there would be serious performance and memory issues.

Slicing: Now, let’s look at a few ways to subset your array.

1: Indexing with Slices: This is the most common method for slicing your array. Like 1-D objects such as Python lists, 1-D arrays can be sliced in the familiar way. Higher-dimensional arrays give you more options, as you can slice one or more axes. Note that a colon by itself means to take the entire axis.

import numpy as np
arr_2d=np.array([[1,2,3], [4,5,6], [7,8,9]])
arr_2d[:2, 1:]
Out[1] array([[2, 3], [5, 6]])
arr_2d[:, :1]
Out[2] array([[1], [4], [7]])
Slicing with Index Values

I highly recommend that you try to figure out on your own how these results were obtained.

2: Boolean Indexing: You can also slice your NumPy array with another array of Boolean type. Consider the example below.
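Once you have tried it yourself, the sketch below is one way to check your reasoning. It repeats the two slices from above and adds a third, arr_2d[:, 0], which is an addition for comparison only: a slice keeps the axis, while a plain integer index drops it.

```python
import numpy as np

arr_2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

top_right = arr_2d[:2, 1:]   # rows 0 and 1, columns 1 and 2
first_col = arr_2d[:, :1]    # all rows, column 0 kept as a 2-D column
flat_col = arr_2d[:, 0]      # an integer index drops the axis, giving 1-D

print(top_right.shape)       # (2, 2)
print(first_col.shape)       # (3, 1)
print(flat_col.shape)        # (3,)
```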

cities=np.array(['Delhi', 'Pune', 'Mumbai', 'Shimla', 'Manali', 'Delhi'])
data= np.random.randn(6,4)
data
Out[3] array([[-0.22606324, 1.53353387, 1.51530491, 0.3718246 ],
[ 1.33571102, -0.55789854, 0.72791453, -0.06581134],
[ 0.25202173, -0.75841212, -0.80382225, 1.95401317],
[ 0.02681555, -1.20263412, -0.13671105, -0.01325391],
[ 0.25647382, -0.01122541, -0.57057744, 0.09792998],
[ 1.02348586, -0.12688505, 0.01910011, -0.88353741]])
data[cities== 'Delhi']
Out[4] array([[-0.22606324, 1.53353387, 1.51530491, 0.3718246 ],
[ 1.02348586, -0.12688505, 0.01910011, -0.88353741]])
mask = (cities=='Delhi') | (cities=='Pune')
data[mask]
Out[5] array([[-0.22606324, 1.53353387, 1.51530491, 0.3718246 ],
[ 1.33571102, -0.55789854, 0.72791453, -0.06581134],
[ 1.02348586, -0.12688505, 0.01910011, -0.88353741]])

Here I created an array of cities in which “Delhi” appears twice, and a 6×4 array of random normal values. Note that cities has 6 elements and data has 6 rows; the Boolean array must be as long as the axis it indexes. Then I simply passed the condition cities== 'Delhi', which returns a Boolean array ([True, False, False, False, False, True]). This subsets our data array to the rows where the condition is True, i.e. the rows at index 0 and 5.

You can further create masks based on your own conditions and pass them to slice your data array.
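As a rough sketch of such custom masks, the example below negates a condition, builds a mask with np.isin, and uses a mask on the values themselves for assignment. The seed, the mask names (not_delhi, hill) and the replacement values are assumptions made up for illustration.

```python
import numpy as np

cities = np.array(['Delhi', 'Pune', 'Mumbai', 'Shimla', 'Manali', 'Delhi'])
np.random.seed(0)                       # fixed seed (an assumption) so runs repeat
data = np.random.randn(6, 4)

# Combine conditions with &, | and ~ ; Python's `and` / `or` raise an error on arrays
not_delhi = ~(cities == 'Delhi')
hill = np.isin(cities, ['Shimla', 'Manali'])

subset = data[not_delhi]                # the 4 rows that are not Delhi (a copy)

data[data < 0] = 0                      # a mask on the values: clip negatives to 0
data[hill] = 7                          # a mask on the rows: overwrite whole rows

print(subset.shape)                     # (4, 4)
```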

3: Take, Put and Fancy Indexing: Fancy indexing is a term adopted by NumPy which simply means indexing with integer arrays.

One thing to keep in mind is that fancy indexing always copies the data into a newly created array, which is exactly the opposite of what slicing does.
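The difference is easy to see side by side. This is a small sketch with a made-up array; the last line also shows that assigning through a fancy index on the left-hand side still writes into the original array, since the copy is only made when you read.

```python
import numpy as np

arr = np.arange(10)

fancy = arr[[2, 3, 4]]       # integer-array index: a new array with copied data
fancy[:] = -1                # does not touch arr
after_fancy = arr.copy()     # snapshot: still 0..9

sliced = arr[2:5]            # slice: a view on arr
sliced[:] = 99               # writes through to arr

arr[[0, 1]] = 50             # assignment via a fancy index modifies arr in place

print(after_fancy)           # [0 1 2 3 4 5 6 7 8 9]
print(arr)                   # [50 50 99 99 99  5  6  7  8  9]
```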

Out[6] array(['Shimla', 'Pune', 'Delhi'], dtype='<U6')

We can also use negative indices to fetch our data. We can also pass two index arrays to select a 1-D array of elements from a 2-D array, as shown.

Out[7] array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15],
[16, 17, 18, 19],
[20, 21, 22, 23]])
Out[8] array([ 4, 23, 13, 10])
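The index arrays behind these outputs are not shown in the text, so the sketch below uses one set of indices that reproduces them; treat the specific lists (including the negative index) as assumptions. With two index arrays, NumPy pairs them up element by element, so the result is 1-D.

```python
import numpy as np

cities = np.array(['Delhi', 'Pune', 'Mumbai', 'Shimla', 'Manali', 'Delhi'])
picked = cities[[3, 1, -1]]          # -1 counts from the end
print(picked)                        # ['Shimla' 'Pune' 'Delhi']

arr = np.arange(24).reshape(6, 4)
rows = [1, 5, 3, 2]
cols = [0, 3, 1, 2]

pairs = arr[rows, cols]              # elements (1,0), (5,3), (3,1), (2,2)
print(pairs)                         # [ 4 23 13 10]

# To get the rectangular block of those rows and columns instead,
# index in two steps
block = arr[rows][:, cols]
print(block.shape)                   # (4, 4)
```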

There’s an alternative for making selections on a single axis: the “take” and “put” methods, which can be faster than fancy indexing.

# Fancy Index
%timeit arr[ind]
Out[9] 32.8 µs ± 131 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
# take
%timeit arr.take(ind)
Out[10] 26.3 µs ± 89.9 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
# puts 2 in place of first 1000 values
arr.put(ind, 2)

Remember, take and put are just equivalents of fancy indexing. Since put has no axis argument, when you need to set elements using an index array along another axis, fancy indexing is what you should go with.
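Here is a short sketch of that equivalence and of where it stops. The arrays and index values are made up for illustration. take accepts an axis argument, whereas put indexes the array as if it were flattened, so writing along an axis is left to fancy indexing.

```python
import numpy as np

arr = np.arange(12).reshape(3, 4)
ind = [0, 2]

cols_take = arr.take(ind, axis=1)    # select columns 0 and 2
cols_fancy = arr[:, ind]             # the fancy-indexing equivalent
print(np.array_equal(cols_take, cols_fancy))   # True

flat = np.arange(10) * 10
flat.put([0, 1], [-1, -2])           # same effect as flat[[0, 1]] = [-1, -2]
print(flat[:3])                      # [-1 -2 20]

# put has no axis argument, so setting whole columns uses fancy indexing
arr[:, ind] = 0
print(arr)
```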

