Upper cell dependencies

Lyle · June 18, 2020, 11:45pm

Hi there, newbie here. Been learning Python and Jupyter over these last few covid months. Getting pretty good at it and really enjoying the learning.

Sorry if I get the terminology wrong in my question, but here goes.

So let’s say I’ve got multiple cells. I’ve done a calculation up above and have a new variable (or a new value has been assigned to a variable). MOST of the time, two cells down I can access that variable fine as long as I’ve run the upper cell at least once. But SOMETIMES it won’t let me. The only way to access that variable and run my cell is to run that upper cell every time before running the cell in question (or “run all”).

Is there some logic to this that I’m missing? I haven’t been able to discover a pattern so far.

Thanks in advance for your input. It’s just one of those quality of life things that I would love to solve!

MSeal · June 19, 2020, 5:57pm

Hi Lyle,

Generally think of the notebook cells as all saving to the same global variable space in the notebook. You can actually check these by calling locals() outside a function – or globals() inside a function.
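A rough sketch of that shared namespace, for illustration only: a plain dict stands in for the kernel's global variable space, and two strings stand in for cells. The names `kernel_namespace`, `cell_1`, and `cell_2` are made up here; a real kernel manages all of this for you.

```python
# A plain dict stands in for the kernel's global namespace (an assumption
# made for illustration; this is not how you would normally run cells).
kernel_namespace = {}

cell_1 = "total = 40 + 2"       # the "upper" cell
cell_2 = "doubled = total * 2"  # a cell further down that uses `total`

exec(cell_1, kernel_namespace)  # run the upper cell once
exec(cell_2, kernel_namespace)  # the lower cell can read `total`
exec(cell_2, kernel_namespace)  # and can be re-run without re-running cell_1

# Both names now live in the same namespace.
print(kernel_namespace["total"], kernel_namespace["doubled"])
```

As long as the namespace survives, the lower cell keeps working no matter how many times it is re-run.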

But SOMETIMES it won’t let me.

This seems odd. Either A) you are deleting the global variable, B) you restarted the kernel (the process actually running your Python code), in which case the variable would no longer be available, or C) you have a local variable inside the function you are using that shadows the global one, and its value (e.g. None) makes you think the variable doesn’t exist.
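To make those three cases concrete, here is a minimal sketch. The helper `run_cell` and the variable names are hypothetical, and a fresh dict is only a stand-in for a restarted kernel.

```python
def run_cell(source, namespace):
    """Run `source` in `namespace`; return the NameError text, or None on success."""
    try:
        exec(source, namespace)
    except NameError as err:
        return str(err)
    return None

# A) The variable was deleted after being defined.
ns = {}
run_cell("value = 10", ns)
run_cell("del value", ns)
deleted_error = run_cell("print(value)", ns)

# B) The kernel was restarted: modeled here as a fresh, empty namespace.
restarted_error = run_cell("print(value)", {})

# C) A local variable inside a function shadows the global one.
shadow_ns = {}
run_cell(
    "value = 10\n"
    "def report():\n"
    "    value = None  # this local name hides the global `value`\n"
    "    return value\n"
    "result = report()",
    shadow_ns,
)

print(deleted_error)
print(restarted_error)
print(shadow_ns["value"], shadow_ns["result"])
```

Cases A and B raise a NameError when the later cell runs. Case C raises nothing: the global is still there, but the function returns its own local value instead.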

You may want to create the simplest example that reproduces the issue and that will either quickly reveal the problem as you remove surrounding code snippets, or give a clearer picture about what execution patterns you are using here.

Lyle · June 19, 2020, 11:45pm

Thanks for getting back to me!

OK, good to know that it is supposed to work the way I thought it was supposed to work. I guess I wanted to make sure I wasn’t missing something fundamental before I spent more time researching it.

It will definitely be my mission now to figure it out the next time it happens. I like your idea of removing code until I narrow down what is causing the problem. I will report back!

Lyle · July 7, 2020, 1:24pm

Hi, there. I finally found some time to do further research and can now report back.

Below are two simplified cells of code to demonstrate what happened to me:
[python]
import pandas as pd

df1 = pd.read_csv('filename.csv')
# code
[/python]
[python]
df2 = df1
df2['col_new'] = df1['colA'].rolling(12).sum().round(0)
df2.drop(['colA'], axis=1, inplace=True)
# code
[/python]

In this case, I could only run the second cell after running the first cell. Every time.

What I eventually found was that the .drop() method was dropping colA from BOTH dataframes. So the next time this cell was run, it would choke on the col_new assignment, 'cause it couldn’t find colA.

I don’t understand why this is happening, but my solution was to use the .copy() method which fixed this problem:
[python]
df2=df1.copy()
[/python]

fomightez · July 7, 2020, 3:07pm

Yes, that wasn’t a Jupyter issue. You would find the same thing running Python in a console/command line. It is a basic Python issue.

It is something you’ll want to understand going forward but you stumbled upon the solution already.
Your line df2=df1 is the issue here and has nothing to do with Jupyter cells. That assignment doesn’t copy the dataframe. You’ll encounter the same behavior with other mutable Python objects, such as a list, if you do that type of assignment. This stackoverflow answer covers it for lists, but the same concept holds for other data types, like your dataframe.
Your line df2=df1 just binds a second name to the same dataframe; it doesn’t create a new one, so both df2 and df1 refer to the same dataframe after the assignment. So you thought your line df2.drop(['colA'],axis=1, inplace=True) was dropping that column only in df2; however, it was dropping it from the one dataframe that df1 also refers to. That came as a surprise because you hadn’t yet realized that both names referenced the same dataframe object. Copying lists and dataframes is the way to go when you want to keep the original intact while operating on another.
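Here is a small sketch of the same idea with plain Python objects instead of a dataframe. The variable names are made up for illustration, and the nested case at the end is an extra caveat about shallow copies of containers, not something from your notebook.

```python
import copy

original = [1, 2, 3]
alias = original               # a second name for the same list object
alias.append(4)                # mutates the one shared list

independent = original.copy()  # a new list object with the same items
independent.append(5)          # leaves `original` untouched

print(alias is original)       # both names point at one object
print(original, independent)

# With nested containers, a shallow copy still shares the inner objects.
nested = {"cols": ["colA", "colB"]}
shallow = copy.copy(nested)
deep = copy.deepcopy(nested)
shallow["cols"].remove("colA")  # also changes nested["cols"]

print(nested["cols"], deep["cols"])
```

The `is` check is a quick way to tell whether two names refer to the same object. In your notebook, `df2 is df1` would be True after `df2=df1` and False after `df2=df1.copy()`.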

Lyle · July 8, 2020, 1:34am

Got it! Thanks for the excellent explanation!!

astar · January 31, 2025, 11:09am

Hi - I’ve had this too. I initially thought it was a “feature” of Jupyter notebooks and that scoping to individual cells was deliberate. Seems it’s not? I’ve just tried again and the variables declared in initial cells can now be accessed by later cells. I wonder if it’s a bug introduced by not having discarded the originally created ‘untitled’ notebook when having ‘saved as’.

fomightez · January 31, 2025, 5:17pm

Do you mean using Save Notebook As... from under File?

To keep the kernel active, I would suggest using Rename... (or Rename Notebook in JupyterLab) in the case you seem to be describing, where you start with 'Untitled.ipynb' and have entered and run some code.

Save Notebook As... in my experience makes a new notebook with the original source content, but with a new kernel session. You should be able to see that by making a new cell after carrying out the process you describe and running 2 + 2. It should show that it is the first cell run by displaying [1] in the brackets to the left of the cell. So if it is a new kernel, there are no previously assigned variables or state. In other words, it is not linked to the original kernel.

Rename... (or Rename Notebook in JupyterLab) keeps the kernel namespace active so you should be able to continue on working with the current state.
