Having issues with Jupyter Notebook

I was trying to follow along with a course that uses Jupyter Notebook (the course was made in 2020) and I ran into a couple of errors. Most were ones I could easily solve, because they were simply deprecated or removed commands that I could replace. But with these two errors, I have no idea what the real issue is, and reading other people's past questions and answers has not helped me, so I'm bringing it here.

First error

Description: I have saved a music recommender model using joblib, and I am trying to use the saved model to make a prediction based on age (21) and gender (1, which represents male).

import pandas as pd
from sklearn.tree import DecisionTreeClassifier
import joblib

Code I commented out
#musicdata = pd.read_csv('musicgenres.csv')
#X = musicdata.drop(columns=['Genre'])
#y = musicdata['Genre']

#model = DecisionTreeClassifier()
#model.fit(X, y)
model = joblib.dump(model, 'music-recommender.joblib')
predictions = model.predict([[21, 1]])
predictions

Error message:
AttributeError                            Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_10056/639016337.py in <module>
     14
     15 model = joblib.dump(model, 'music-recommender.joblib')
---> 16 predictions = model.predict(21)
     17 predictions

AttributeError: 'list' object has no attribute 'predict'

Second error

Description: The course deliberately hasn't explained the purpose of this code yet, but it should output a visual representation of a decision tree.

import pandas as pd
from sklearn.tree import DecisionTreeClassifier
from sklearn import tree

musicdata = pd.read_csv('musicgenres.csv')
X = musicdata.drop(columns=['Genre'])
y = musicdata['Genre']

model = DecisionTreeClassifier()
model.fit(X, y)

tree.export_graphviz(model, out_file='music-recommender.dot',
                                  feature_names= ['age', 'gender'],
                                  class_name=sorted(y.unique()),
                                  label='all',
                                  rounded=True,
                                  filled=True)

Error message:
TypeError                                 Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_10056/3981914369.py in <module>
     13 model.fit(X, y)
     14
---> 15 tree.export_graphviz(model, out_file='music-recommender.dot',
     16 feature_names= ['age', 'gender'],
     17 class_name=sorted(y.unique()),

C:\ProgramData\Anaconda3\lib\site-packages\sklearn\utils\validation.py in inner_f(*args, **kwargs)
     61 extra_args = len(args) - len(all_args)
     62 if extra_args <= 0:
---> 63 return f(*args, **kwargs)
     64
     65 # extra_args > 0

TypeError: export_graphviz() got an unexpected keyword argument 'class_name'


If anybody knows what I'm doing wrong or what I should write in my cell instead, please tell me.

These questions are not in the scope of this forum. They appear to be questions about machine learning in the Python ecosystem or about Python syntax. You may want to check the course resources to find out where to ask such questions. Machine learning is a hot topic, and it may just be that the underlying modules and methods you have used have evolved slightly since the course was made. Check your code very carefully when debugging. (That being said, there are a lot of Python experts hanging around here, and you may be lucky enough that someone will instantly see what the issue is. Relying on that is probably not a good strategy, though.)

In the future, one way to think about whether questions fit this forum is to ask yourself if you ran these as a traditional Python script on the command line, would you see the same thing. It seems in these two cases you would.

Your course is using Jupyter notebook as a way to run Python and do machine learning in a way that makes the code and output easy to organize, analyze, change, and rerun. That’s great but hopefully they also showed you how to run Python code separate from Jupyter. If not, that is something you could learn about to help you troubleshoot the sources of your issues.

@fomightez is correct, this is out of scope for this forum. However, since we tend to be a curious bunch, the following might get you moving forward.

First error: According to the joblib docs, joblib.dump(model, 'music-recommender.joblib') persists the model object to a file named 'music-recommender.joblib' and returns a list of the filenames in which the object was stored. As a result, your code has replaced the original object (model) with a list of strings, which has no predict method. Since joblib.load is the inverse operation to dump(), you might consider:

joblib.dump(model, 'music-recommender.joblib')
model = joblib.load('music-recommender.joblib')
predictions = model.predict([[21, 1]])

or simply don't assign the return value of joblib.dump() back to model, so that the original object stays intact. I'm not sure what the persistence is buying you here, but I suspect, since this is a course, that you missed the corresponding load() call.
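To make the dump/load round trip concrete, here is a minimal sketch. It assumes a small made-up dataset in place of musicgenres.csv (the ages, genders, and genres below are hypothetical) and writes to a temporary directory; the point is only that dump() hands back a list of filenames, while load() hands back the fitted model.

```python
import os
import tempfile

import joblib
from sklearn.tree import DecisionTreeClassifier

# Hypothetical stand-in for musicgenres.csv: each row is [age, gender]
X = [[20, 1], [23, 1], [26, 1], [31, 1], [20, 0], [23, 0], [26, 0], [31, 0]]
y = ["HipHop", "HipHop", "Jazz", "Classical",
     "Dance", "Dance", "Acoustic", "Classical"]

model = DecisionTreeClassifier(random_state=0)
model.fit(X, y)

# Write into a temporary directory so the sketch leaves no files behind
path = os.path.join(tempfile.mkdtemp(), "music-recommender.joblib")

# dump() returns a list of filenames, NOT the model, so keep it
# in a separate variable instead of overwriting `model`
saved_files = joblib.dump(model, path)
print(type(saved_files))

# load() is what returns the fitted estimator
loaded_model = joblib.load(path)
predictions = loaded_model.predict([[21, 1]])
print(predictions)
```

In a course setting the training lines would typically be commented out after the first run, leaving only the joblib.load() call and the prediction.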

Second error: Again, looking at the docs for export_graphviz(), this is a typo. The keyword argument class_name should be plural: class_names.
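As a sketch of the corrected call, the snippet below uses the plural class_names keyword. The training data is again a hypothetical stand-in for musicgenres.csv, and out_file=None is used here (instead of the course's 'music-recommender.dot') so that the DOT source comes back as a string and can be inspected without writing a file.

```python
from sklearn import tree
from sklearn.tree import DecisionTreeClassifier

# Hypothetical stand-in for musicgenres.csv: each row is [age, gender]
X = [[20, 1], [23, 1], [26, 1], [31, 1], [20, 0], [23, 0], [26, 0], [31, 0]]
y = ["HipHop", "HipHop", "Jazz", "Classical",
     "Dance", "Dance", "Acoustic", "Classical"]

model = DecisionTreeClassifier(random_state=0)
model.fit(X, y)

# Note the plural keyword: class_names (class_name raises TypeError).
# With out_file=None the DOT source is returned as a string.
dot_source = tree.export_graphviz(
    model,
    out_file=None,
    feature_names=["age", "gender"],
    class_names=sorted(set(y)),
    label="all",
    rounded=True,
    filled=True,
)
print(dot_source[:300])
```

Passing out_file='music-recommender.dot' instead writes the same text to disk, where a Graphviz viewer can render it.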

I hope this helps. Wayne's tip on using Python directly as a litmus test for whether an issue belongs in this forum is spot-on. If the issue occurs when the code runs in a notebook, but NOT when it runs in Python directly (or via its REPL), then this is the place to ask.

Take care.

Thank you for the feedback. Unfortunately, the course did not tell me anything about testing my code outside of Jupyter Notebook, but I will keep that in mind in the future.