start [2022/01/21 15:46] – [DSL LAB IS OPEN] deadline | start [2022/03/31 15:51] (current) – deadline |
---|
| |
* [[how_do_i#using_python|Python Anaconda]] is installed | * [[how_do_i#using_python|Python Anaconda]] is installed |
* [[how_do_i#r_studio|RStuduio]] is installed | * [[how_do_i#r_studio|RStudio]] is installed |
| * [[how_do_i#using_the_zeppelin_web_notebook|Zeppelin Notebooks]] are now fully configured for Python, PySpark, Spark, Hive, and shell programming. A notebook called **Basic Tests (Python, PySpark, sh, and Hive)** is available for learning more about Zeppelin (clone it first). |
| * [[how_do_i#transfer_files_to_from_the_cluster|Transferring Files from the Cloud]] has been added. The [[using_rclone|rclone]] package has been installed on all workstations (rclone is a command line tool) |
| * [[how_do_i#use_tensorflow|Python TensorFlow]] (CPU and GPU) and Keras are installed |
**Watch this space for updates.** | **Watch this space for updates.** |
| |
====About The System==== | ====About The System==== |
| |
This computation resource is a cluster of workstations that can work together as one big systems. The system can run large Hadoop and Spark jobs using the 10 TByte Hadoop Distributed File System (HDFS) and up to 120 cores. There are also three GPU equipped nodes that will be configured to run TensorFlow. | This computation resource is a collection of nine individual workstations that can work together as a scalable data science cluster for Big Data processing. The system can run large Hadoop and Spark jobs using the 10 TByte Hadoop Distributed File System (HDFS) on up to 120 cores. There are also three GPU equipped nodes that are configured to run TensorFlow. Total system memory is 600 GBytes spread across |
| 30 separate motherboards. |
| |
| Each workstation provides a Linux desktop environment that supports Anaconda Navigator (Python), RStudio, and the Zeppelin web notebook (Spark, PySpark, Hadoop Hive, HBase, Python). |
| |
====FOR HELP CLICK ON THE "How Do I" LINK BELOW==== | ====FOR HELP CLICK ON THE "How Do I" LINK BELOW==== |
| |
**System News:** | **System News:** |
Jan-21-2022. Anaconda Python and RStudio installed | Feb-18-2022 Python TensorFlow (CPU and GPU) and Keras installed |
Jan-20-2022. System is ready for users | Feb-14-2022 Zeppelin Notebooks are configured and rclone installed |
| Feb-07-2022 Anaconda Navigator |
| Jan-21-2022 Anaconda Python and RStudio installed |
| Jan-20-2022 System is ready for users |
Nov-11-2012 Upgrade to CentOS 7 in progress | Nov-11-2012 Upgrade to CentOS 7 in progress |
| ---- OLD SYSTEM ---- |
Aug-30-2019 Python options are now (default, V-2.6.6) or Anaconda (V-3.7.1). Zeppelin now supports Python3, | Aug-30-2019 Python options are now (default, V-2.6.6) or Anaconda (V-3.7.1). Zeppelin now supports Python3, |
PySpark, Spark1, Spark2, and SparkR. See the "How Do I" page for information. | PySpark, Spark1, Spark2, and SparkR. See the "How Do I" page for information. |