Installing scikit-learn, pandas, scipy, numpy on a Mac with M1
Before installing scikit-learn, you should install its dependencies.
Installing Pandas
Pandas installs fine with pip install pandas. However, when I tried to import it in my Jupyter Lab notebook, it crashed with this error:
ValueError Traceback (most recent call last)
<ipython-input-2-b6c7f9fc9652> in <module>
----> 1 import pandas as pd
~/miniforge3/envs/tf25/lib/python3.9/site-packages/pandas/__init__.py in <module>
27
28 try:
---> 29 from pandas._libs import hashtable as _hashtable, lib as _lib, tslib as _tslib
30 except ImportError as e: # pragma: no cover
31 # hack but overkill to use re
~/miniforge3/envs/tf25/lib/python3.9/site-packages/pandas/_libs/__init__.py in <module>
11
12
---> 13 from pandas._libs.interval import Interval
14 from pandas._libs.tslibs import (
15 NaT,
pandas/_libs/interval.pyx in init pandas._libs.interval()
ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 88 from C header, got 80 from PyObject
This StackOverflow post recommends upgrading numpy to 1.20+, but since I am using TensorFlow, I am stuck with 1.19.5. The post mentions a GitHub ticket that expands on the solutions. One simple solution that could work is to run pip install with some additional flags (--no-cache-dir --no-binary :all:) that supposedly compile the package you are trying to install against the local version of numpy.
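To make the flag-based approach concrete, here is a minimal sketch that assembles that pip command from Python. The helper names build_pip_command and install are hypothetical, and I have not confirmed that this fixes the error on an M1. It defaults to a dry run that only prints the command. Keep in mind that --no-binary :all: applies to every package pip installs in that run, dependencies included, so the build can take a long time.

```python
import subprocess
import sys


def build_pip_command(package, from_source=True):
    """Return the argv list for installing `package` with this interpreter's pip."""
    cmd = [sys.executable, "-m", "pip", "install"]
    if from_source:
        # Flags quoted in the text: skip pip's cache and refuse prebuilt wheels,
        # so the package is compiled locally instead of downloaded as a binary.
        cmd += ["--no-cache-dir", "--no-binary", ":all:"]
    cmd.append(package)
    return cmd


def install(package, from_source=True, dry_run=True):
    """Print the command; run it only when dry_run is False."""
    cmd = build_pip_command(package, from_source)
    print(" ".join(cmd))
    if dry_run:
        return 0
    return subprocess.run(cmd).returncode


if __name__ == "__main__":
    install("pandas")  # dry run: only prints the command
```

Calling install("pandas", dry_run=False) would actually run the build. Using sys.executable makes sure the package lands in the same environment as the running notebook kernel.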
Another person suggests using older packages. I had installed Pandas 1.2.4; they were using Pandas 1.1.2 (incidentally with numpy 1.20). I looked for a compatibility table that would tell me which versions of Pandas support numpy<1.20. I checked https://pypi.org/project/pandas/, https://pandas.pydata.org/pandas-docs/stable/getting_started/install.html and other pages, to no avail.
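Since import pandas itself crashes, it helps to read the installed versions without importing anything. The sketch below uses the standard library's importlib.metadata (Python 3.8+) for that. The function name installed_versions is hypothetical, and the package names in the example list are assumptions: they must match the distribution names that pip list shows in your environment.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            # Reads the package metadata on disk; the package is not imported.
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Assumed names; adjust to what is installed in your environment.
    for name, version in installed_versions(["numpy", "pandas", "tensorflow"]).items():
        print(f"{name}: {version or 'not installed'}")
```

This gives a quick way to compare the numpy version in the environment with the one a given pandas build expects before trying another install.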
Eventually I tried to just install the mentioned version with pip and it failed:
% pip install pandas==1.1.2
Collecting pandas==1.1.2
Downloading pandas-1.1.2.tar.gz (5.2 MB)
|████████████████████████████████| 5.2 MB 2.1 MB/s
Installing build dependencies ... error
ERROR: Command errored out with exit status 1:
command: /Users/anhtuan/miniforge3/envs/tf25/bin/python3.9 /private/var/folders/ym/2b7pw1yn0v71ybqwb07t4vw00000gn/T/pip-standalone-pip-7my5a2_g/__env_pip__.zip/pip install --ignore-installed --no-user --prefix /private/var/folders/ym/2b7pw1yn0v71ybqwb07t4vw00000gn/T/pip-build-env-8qv_ct5z/overlay --no-warn-script-location --no-binary :none: --only-binary :none: -i https://pypi.org/simple -- setuptools wheel 'Cython>=0.29.16,<3' 'numpy==1.15.4; python_version=='"'"'3.6'"'"' and platform_system!='"'"'AIX'"'"'' 'numpy==1.15.4; python_version=='"'"'3.7'"'"' and platform_system!='"'"'AIX'"'"'' 'numpy==1.17.3; python_version>='"'"'3.8'"'"' and platform_system!='"'"'AIX'"'"'' 'numpy==1.16.0; python_version=='"'"'3.6'"'"' and platform_system=='"'"'AIX'"'"'' 'numpy==1.16.0; python_version=='"'"'3.7'"'"' and platform_system=='"'"'AIX'"'"'' 'numpy==1.17.3; python_version>='"'"'3.8'"'"' and platform_system=='"'"'AIX'"'"''
...
many more angry red lines
So I tried with conda install:
% conda install pandas==1.1.2
Collecting package metadata (current_repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
Collecting package metadata (repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
PackagesNotFoundError: The following packages are not available from current channels:
- pandas==1.1.2
Current channels:
- https://conda.anaconda.org/conda-forge/osx-arm64
- https://conda.anaconda.org/conda-forge/noarch
But it wasn't available. On https://conda.anaconda.org/conda-forge/osx-arm64/ I saw that pandas was available from version 1.1.3 onwards, so I installed that one:
% conda install pandas==1.1.3
This time it worked: import pandas ran fine, and I was able to use the library to parse a CSV file.
Things I could try further:
- pip install the latest version with the custom flags.
- conda install pandas with a more recent version.
Installing scipy and numpy
If you install these with pip, the install fails with thousands of lines of red log output. On a Mac with M1, you have to use conda instead:
% conda install scikit-learn
Collecting package metadata (current_repodata.json): done
Solving environment: done
## Package Plan ##
environment location: /Users/anhtuan/miniforge3/envs/tf25
added / updated specs:
- scikit-learn
The following packages will be downloaded:
package | build
---------------------------|-----------------
scikit-learn-0.24.2 | py39hab69601_0 6.6 MB conda-forge
------------------------------------------------------------
Total: 6.6 MB
The following NEW packages will be INSTALLED:
joblib conda-forge/noarch::joblib-1.0.1-pyhd8ed1ab_0
scikit-learn conda-forge/osx-arm64::scikit-learn-0.24.2-py39hab69601_0
threadpoolctl conda-forge/noarch::threadpoolctl-2.1.0-pyh5ca1d4c_0
Proceed ([y]/n)? y
Downloading and Extracting Packages
scikit-learn-0.24.2 | 6.6 MB | ####################################################################################################################################################################### | 100%
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
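Once everything is installed, a quick import check confirms that the libraries actually load in the environment. This is a minimal sketch; the helper name try_import is hypothetical. It catches ValueError as well as ImportError, because the pandas crash earlier was raised as a ValueError.

```python
import importlib


def try_import(module_name):
    """Import a module and return (ok, detail) instead of raising."""
    try:
        module = importlib.import_module(module_name)
    except (ImportError, ValueError) as exc:
        # The binary-incompatibility crash surfaced as ValueError, not ImportError.
        return False, f"{type(exc).__name__}: {exc}"
    return True, getattr(module, "__version__", "unknown version")


if __name__ == "__main__":
    # scikit-learn is imported under the name sklearn.
    for name in ["numpy", "scipy", "pandas", "sklearn"]:
        ok, detail = try_import(name)
        print(f"{name}: {'OK' if ok else 'FAILED'} ({detail})")
```

Run it with the Python interpreter of the conda environment (tf25 in my case) so that it tests the same packages the notebook will use.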