Troubleshooting calculating the median by year

Hey!
I'm new to the Jupyter environment. I've been trying to calculate the median by year (my Excel file has two columns: date (year, month, day) and sales).

When I try to run the cell containing the code, it shows the following:


AxisError Traceback (most recent call last)
in
----> 1 yhat = median (-12, -24, -36)

<array_function internals> in median(*args, **kwargs)

~/opt/anaconda3/lib/python3.8/site-packages/numpy/lib/function_base.py in median(a, axis, out, overwrite_input, keepdims)
3604
3605 """
→ 3606 r, k = _ureduce(a, func=_median, axis=axis, out=out,
3607 overwrite_input=overwrite_input)
3608 if keepdims:

~/opt/anaconda3/lib/python3.8/site-packages/numpy/lib/function_base.py in _ureduce(a, func, **kwargs)
3493 keepdim = list(a.shape)
3494 nd = a.ndim
→ 3495 axis = _nx.normalize_axis_tuple(axis, nd)
3496
3497 for ax in axis:

~/opt/anaconda3/lib/python3.8/site-packages/numpy/core/numeric.py in normalize_axis_tuple(axis, ndim, argname, allow_duplicate)
1389 pass
1390 # Going via an iterator directly is slower than via list comprehension.
→ 1391 axis = tuple([normalize_axis_index(ax, ndim, argname) for ax in axis])
1392 if not allow_duplicate and len(set(axis)) != len(axis):
1393 if argname:

~/opt/anaconda3/lib/python3.8/site-packages/numpy/core/numeric.py in (.0)
1389 pass
1390 # Going via an iterator directly is slower than via list comprehension.
→ 1391 axis = tuple([normalize_axis_index(ax, ndim, argname) for ax in axis])
1392 if not allow_duplicate and len(set(axis)) != len(axis):
1393 if argname:

AxisError: axis -24 is out of bounds for array of dimension 0

Thanks for your time!

Tutorial I’m following

I'm going to suggest you read about what Jupyter is as part of this excellent answer here by @krassowski, because I think you have a similar issue: you mean to ask a Python question. (Yours is a different question, though.) You'd have better luck searching the internet for the answer using Python as a search term rather than Jupyter.
One way to think about it is to ask yourself: if I were running this code in a traditional Python script and not a notebook, would I have the same issue? If so, you have a Python question and not a Jupyter question.


That being said, I think if you ran the first two lines of this tutorial, which are:

# load
series = read_csv('monthly-car-sales.csv', header=0, index_col=0)

You found that gave an error, right? Only later in the tutorial is the rest of the code provided that makes the load step work, because from pandas import read_csv needs to be run first.

I think something similar is going on here. In the line that is giving you an issue, things are being shown out of order and incomplete.
Somewhere along the way, the part showing where the median function comes from got left off.
Also, I am not seeing how those indices (-12, -24, -36) would refer to the series that was read in earlier.
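
For what it's worth, the traceback is consistent with the signature it prints, median(a, axis, out, ...): called as median(-12, -24, -36), NumPy takes -12 as the data, -24 as the axis, and -36 as out, and then complains about the axis. If the tutorial meant "the median of the observations 12, 24, and 36 months back", a minimal sketch might look like the following. The sales values and the lags name are made up for illustration, and this is only my guess at the intent, not the tutorial author's code.

```python
import numpy as np

# Hypothetical monthly sales, oldest first (48 made-up values: 100, 101, ..., 147).
sales = np.arange(100, 148)

# Offsets counted back from the end of the history: 12, 24, and 36 months back.
lags = [-12, -24, -36]

# Pick out the observations at those offsets, then take their median.
# Passing one array means NumPy no longer reads the numbers as (a, axis, out).
yhat = np.median(sales[lags])
print(yhat)  # 124.0
```

The key difference from the failing line is that the three offsets are used to index into the data first, and only the selected values are handed to median.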

In other words, it’s not you. Looks like flawed code. Take the author’s word for it and move on. The author describes what they meant to calculate pretty well. Maybe someday you’ll see something similar and be able to piece it back together correctly.
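
If the underlying goal is still the median of sales per year, as described in the original question, pandas can do that directly without the tutorial's line. Here is a rough sketch with made-up data standing in for the spreadsheet; the column names date and sales are assumptions based on the description, and with a real file you would load it with something like read_csv or read_excel instead of building the DataFrame by hand.

```python
import pandas as pd

# Made-up rows standing in for the spreadsheet: a date column and a sales column.
df = pd.DataFrame({
    "date": pd.to_datetime(["2019-01-15", "2019-06-15", "2019-11-15",
                            "2020-02-15", "2020-08-15"]),
    "sales": [10, 30, 20, 40, 60],
})

# Group the rows by calendar year and take the median of sales within each year.
median_by_year = df.groupby(df["date"].dt.year)["sales"].median()
print(median_by_year)
```

With these sample values the result is 20.0 for 2019 and 50.0 for 2020.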

Given that you seem interested in Jupyter, I'll add that what you stumbled upon is why it is better to look for tutorials already built as Jupyter notebooks and shared that way. The ability to Run All cells makes it less likely that pieces get pulled apart and posted out of order or incomplete, as appears to have happened on the site you reference.
