We use pandas for numerous data analysis tasks in our backend. However, while setting up Jenkins for automated builds and tests, I ran into a problem installing the requirements in a low-memory environment.
Issue overview
(.pyenv)[root@ip-172-30-0-59 MyAPI]# pip install pandas
Collecting pandas
Using cached pandas-0.18.0.tar.gz
.....
pandas/algos.c:41172:21: warning: ‘__pyx_pybuffernd_tot_wgt.diminfo[0].strides’ may be used uninitialized in this function [-Wmaybe-uninitialized]
__Pyx_LocalBuf_ND __pyx_pybuffernd_tot_wgt;
^
pandas/algos.c:41637:21: warning: ‘__pyx_pybuffernd_tot_wgt.diminfo[0].shape’ may be used uninitialized in this function [-Wmaybe-uninitialized]
} else if (unlikely(__pyx_t_20 >= __pyx_pybuffernd_tot_wgt.diminfo[0].shape)) __pyx_t_7 = 0;
^
{standard input}: Assembler messages:
{standard input}:700931: Warning: end of file not at end of a line; newline inserted
{standard input}:702091: Error: unknown pseudo-op: `.cfi'
{standard input}: Error: open CFI at the end of file; missing .cfi_endproc directive
gcc: internal compiler error: Killed (program cc1)
Please submit a full bug report,
with preprocessed source if appropriate.
See for instructions.
error: command 'gcc' failed with exit status 4
----------------------------------------
Command "/var/lib/jenkins/workspace/MyAPI/.pyenv/bin/python2.7 -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-izfM62/pandas/setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /tmp/pip-fMQFbn-record/install-record.txt --single-version-externally-managed --compile --install-headers /var/lib/jenkins/workspace/MyAPI/.pyenv/include/site/python2.7/pandas" failed with error code 1 in /tmp/pip-build-izfM62/pandas/
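Here gcc is being killed because the machine runs out of memory while compiling the module. The telling line is "gcc: internal compiler error: Killed (program cc1)": the compiler process cc1 was killed mid-compile, which on a low-memory Linux box usually points to the kernel's out-of-memory killer rather than a bug in gcc itself.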
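Before changing anything, it can help to check how much RAM and swap the build machine actually has. The following is a minimal sketch, not part of the original setup: `total_mem_mib` is a hypothetical helper name, and it assumes a Linux host where /proc/meminfo reports MemTotal and SwapTotal in kB.

```shell
#!/bin/sh
# Hypothetical helper: print MemTotal + SwapTotal in MiB.
# Reads a meminfo-format file; defaults to /proc/meminfo (Linux only).
total_mem_mib() {
    awk '/^(MemTotal|SwapTotal):/ { kb += $2 } END { print int(kb / 1024) }' "${1:-/proc/meminfo}"
}

# Report the figure for this machine when /proc/meminfo is available.
if [ -r /proc/meminfo ]; then
    total_mem_mib
fi
```

A small total with no swap at all is consistent with cc1 being killed partway through compiling a large generated C file such as pandas/algos.c.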
Solution
Since memory was the issue, we added a swap file on the instance to cover the shortfall while the pandas module is being installed.
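# Run as root. Create a directory to hold the swap file.
mkdir -p /var/cache/swap/
# Write 512 blocks of 1 MiB of zeros, giving a 512 MiB file.
dd if=/dev/zero of=/var/cache/swap/swap0 bs=1M count=512
# Make the file readable and writable by its owner only.
chmod 0600 /var/cache/swap/swap0
# Format the file as a swap area, then enable it (this does not persist across reboots).
mkswap /var/cache/swap/swap0
swapon /var/cache/swap/swap0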
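Since a Jenkins job may run many times on the same instance, the steps above could be wrapped so that they only run when the swap file is not already active. This is a sketch under assumptions: `swap_is_active` and `ensure_swap` are hypothetical names, the check relies on the Linux /proc/swaps listing, and the script has to run as root.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given swap file is listed as active.
# /proc/swaps has one header line, then one line per swap area with the
# file name in the first column. A second argument overrides the file read.
swap_is_active() {
    awk -v f="$1" 'NR > 1 && $1 == f { found = 1 } END { exit !found }' "${2:-/proc/swaps}"
}

# Hypothetical wrapper around the commands above: create and enable the
# 512 MiB swap file only if it is not already in use. Must run as root.
ensure_swap() {
    swapfile=/var/cache/swap/swap0
    if swap_is_active "$swapfile"; then
        return 0
    fi
    mkdir -p /var/cache/swap/ &&
    dd if=/dev/zero of="$swapfile" bs=1M count=512 &&
    chmod 0600 "$swapfile" &&
    mkswap "$swapfile" &&
    swapon "$swapfile"
}
```

Calling `ensure_swap` before `pip install` in the build step would then be safe to repeat. If the swap is only wanted for the duration of the install, `swapoff /var/cache/swap/swap0` disables it again afterwards.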
Now when I install pandas, the build finishes instead of gcc being killed:
(.pyenv)[root@ip-172-30-0-59 MyAPI]# pip install pandas
Collecting pandas
Using cached pandas-0.18.0.tar.gz
Requirement already satisfied (use --upgrade to upgrade): python-dateutil in ./.pyenv/lib/python2.7/site-packages (from pandas)
Requirement already satisfied (use --upgrade to upgrade): pytz>=2011k in ./.pyenv/lib/python2.7/site-packages (from pandas)
Collecting numpy>=1.7.0 (from pandas)
Using cached numpy-1.11.0-cp27-cp27mu-manylinux1_x86_64.whl
Requirement already satisfied (use --upgrade to upgrade): six>=1.5 in ./.pyenv/lib/python2.7/site-packages (from python-dateutil->pandas)
Installing collected packages: numpy, pandas
Running setup.py install for pandas ... done
Successfully installed numpy pandas