This is not a Jupyter-related issue. Think of it this way: if you ran all your code as a Python script, you’d have the same issues. Ideally, you’d seek out a Python-focused resource to post about this.
Ideally, if someone wanted to help you anywhere, you’d want to help them do it by making things easy. One way to do that is to post a minimum reproducible example, see here. Often while working out the minimum reproducible example, you’ll see where you went wrong. The notebooks you linked to are rather large, so it is not readily clear how copy3 and 5 differ. In your post, you reference ‘lookup-file code’. Do you mean lookup_file_list? See how you aren’t helping others help you when you don’t reference the Python object or variable name at issue in your post? You didn’t even post a real example of your file names in your post.
It just looks like you aren’t parsing your file names very well.
Consider the following line:
if (i[-16:-6] == collection_date) and (i[-4:]=='.csv'):
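A quick way to check a slice like this is to run it against a couple of made-up names. The sketch below assumes file names of the form prefix_YYYY-MM-DD_N.csv, which is only a guess since no real file names were posted; slice_date and the sample names are hypothetical.

```python
def slice_date(name):
    """Return the same fixed-offset slice used in the comparison above."""
    return name[-16:-6]


# Hypothetical file names; the real ones are not shown in the post.
names = [
    "site_2021-05-03_7.csv",   # one digit between the last "_" and ".csv"
    "site_2021-05-03_12.csv",  # two digits between the last "_" and ".csv"
]

collection_date = "2021-05-03"

for name in names:
    sliced = slice_date(name)
    print(f"{name}: slice={sliced!r}, matches={sliced == collection_date}")

# site_2021-05-03_7.csv: slice='2021-05-03', matches=True
# site_2021-05-03_12.csv: slice='021-05-03_', matches=False
```

With these assumed names, the extra digit shifts every negative offset by one, so the slice no longer lines up with the date.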
i[-16:-6] looks wrong to me: isn’t your parsing going to break when you have more than a single digit after the _ character that sits just in front of the .csv extension? Is your parsing code so complex because some files without distinct or well-formatted names also lurk in that directory? Are you familiar with the fnmatch module? Using fnmatch and str.split() more would probably help you. For example, in the file names seen among the output in the notebooks you linked to, you can easily get the collection_date with i.split("_") so that you avoid