Welcome to the Data Science Lab Cluster
This computing resource is a cluster of workstations that can work together as one large system. Currently, the system can run large Hadoop and Spark jobs. There are also three GPU-equipped nodes that are configured to run TensorFlow.
FOR HELP CLICK ON THE "How Do I" LINK BELOW
HINT: To get back to this main page from any page in the wiki, click on the Data Science Lab in the upper left corner.
System News:
Aug-30-2019: Python options are now the system default (V-2.6.6) or Anaconda (V-3.7.1). Zeppelin now supports Python3, PySpark, Spark1, Spark2, and R. See the "How Do I" page for information.
Feb-20-2019: Python Anaconda is now available; see the "How Do I" page for information on how to access it.
Nov-27-2018: The default Spark version is now 2.1.0; the default PySpark uses Python 3.6.3.
Nov-07-2018: The Zeppelin Notebook is now available; see System Access above.
Nov-01-2018: Python 3.6 updated on all systems with modules: numpy, matplotlib, TextBlob, scipy, scikit-learn, gensim, pillow, h5py, xgboost, mysqlclient, happybase (see "How Do I" for usage information).
Jul-26-2018: R Studio server is installed. Enter "http://localhost:8787" in a browser to access it.
May-02-2018: R Libraries: see /opt/share/doc/Installing-R-Libraries for how to install your own R libraries.
Apr-20-2018: Python HBase library HappyBase installed. TensorFlow is now running on Limulus8-TF and Limulus9-TF.
Feb-21-2018: A current Wikipedia snapshot is in HDFS at /data/Wikipedia
Feb-19-2018: HDFS is now available on all limulus machines as /mnt/hdfs
Feb-19-2018: Annotated examples from the purple Hadoop book are in /opt/share/doc/Hadoop2_Quick_Start_V1
Feb-14-2018: The following Python 2.7 modules are installed: nltk, keras, numpy, pandas, matplotlib, TextBlob, scipy, Tensorflow, scikit-learn, gensim, pillow, h5py. !!! Be sure to run "scl enable devtoolset-6 python27 bash" to use Python 2.7.
Feb-09-2018: All external ssh connections will close after 30 minutes of inactivity. Internal ssh connections (machine to machine) will close after 1 hour of inactivity.
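Because several Python interpreters are mentioned in the news above, it can help to confirm which one is active and which of the modules from the Nov-01-2018 entry it can see. The following is only a minimal sketch, not an official cluster tool: the function name check_modules and the MODULES table are made up for illustration, and the import names (for example sklearn for scikit-learn, PIL for pillow, MySQLdb for mysqlclient) are the usual ones for those packages. It needs Python 3.4 or later, so it will not run under the default V-2.6.6 interpreter.

```python
import importlib.util
import sys

# Package name as listed in the news entry -> name used in an import statement.
# This table is an illustrative assumption, not a cluster-maintained list.
MODULES = {
    "numpy": "numpy",
    "matplotlib": "matplotlib",
    "TextBlob": "textblob",
    "scipy": "scipy",
    "scikit-learn": "sklearn",
    "gensim": "gensim",
    "pillow": "PIL",
    "h5py": "h5py",
    "xgboost": "xgboost",
    "mysqlclient": "MySQLdb",
    "happybase": "happybase",
}


def check_modules(modules):
    """Return {package name: True/False} for whether each module can be found.

    find_spec only locates the module; it does not import it, so the check
    is fast and has no side effects.
    """
    return {
        package: importlib.util.find_spec(import_name) is not None
        for package, import_name in modules.items()
    }


if __name__ == "__main__":
    # Show which interpreter is running, then one line per package.
    print("Python", sys.version.split()[0], "at", sys.executable)
    for package, found in sorted(check_modules(MODULES).items()):
        print("{:<14} {}".format(package, "found" if found else "MISSING"))
```

Saved under any file name and run with the Python 3 interpreter of your choice, this prints one line per package. A MISSING line usually means a different interpreter is active than the one the modules were installed for; the "How Do I" page describes how to select one.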
start.1567625541.txt.gz · Last modified: 2019/09/04 19:32 by deadline