
Day 2 Inquiry on Courses, Intro to R and Master's in Data Science

I checked several universities online for courses in data analysis, data science, machine learning or computational science. Such programs aren't widespread in the UAE, so I sent some emails and asked a few recent LinkedIn contacts.

Check this link for a lot of good information on the career path:
http://www.mastersindatascience.org/careers/data-scientist/

Now on to learning R. It seems cool, especially vectors and factors: I collect similar data (e.g. numbers) in a vector, classify it by a pattern (e.g. days), and then run some analysis on the groups to extract information.
I have taken this course: https://campus.datacamp.com/courses/free-introduction-to-r
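As a minimal sketch of that idea (the numbers and the names `winnings`, `days` and `day_factor` are made up for illustration, not taken from the course), a numeric vector can be grouped by a factor of days and then summarised per group:

```r
# Hypothetical daily figures, invented for this example
winnings <- c(140, -50, 20, -120, 240, 60, -30)

# The day each figure belongs to (same length as winnings)
days <- c("Mon", "Tue", "Wed", "Thu", "Fri", "Mon", "Tue")

# A factor stores categorical data with a fixed set of levels
day_factor <- factor(days, levels = c("Mon", "Tue", "Wed", "Thu", "Fri"))

# Sum the numbers within each day, using the factor as the grouping
total_by_day <- tapply(winnings, day_factor, sum)

# Count how many observations fall on each day
count_by_day <- table(day_factor)

print(levels(day_factor))
print(total_by_day)
print(count_by_day)
```

Here `factor()`, `tapply()` and `table()` are base R functions, so nothing needs to be installed to try this at the R prompt.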

What Kind of Skills Will I Need?

Technical Skills

  • Math (e.g. linear algebra, calculus and probability)
  • Statistics (e.g. hypothesis testing and summary statistics)
  • Machine learning tools and techniques (e.g. k-nearest neighbors, random forests, ensemble methods, etc.)
  • Software engineering skills (e.g. distributed computing, algorithms and data structures)
  • Data mining
  • Data cleaning and munging
  • Data visualization (e.g. ggplot and d3.js) and reporting techniques
  • Unstructured data techniques
  • R and/or SAS languages
  • SQL databases and database querying languages
  • Python (most common), C/C++, Java, Perl
  • Big data platforms like Hadoop, Hive & Pig
  • Cloud tools like Amazon S3

What About Certifications?

To avoid wasting time on poor-quality certifications, ask your mentors for advice, check job listing requirements and consult articles like Tom’s IT Pro’s “Best Of” certification lists. Here are a few that focus on useful skills:

Certified Analytics Professional (CAP)

CAP was created in 2013 by the Institute for Operations Research and the Management Sciences (INFORMS) and is targeted towards data scientists. During the certification exam, candidates must demonstrate their expertise in the end-to-end analytics process. This includes the framing of business and analytics problems, data and methodology, model building, deployment and life cycle management.
Requirements:
  • 5+ years of analytics work-related experience for BA/BS holder in a related area
  • 3+ years of analytics work-related experience for MA/MS (or higher) holder in a related area
  • 7+ years of analytics work-related experience for BA/BS (or higher) holder in an unrelated area
  • Verification of soft skills/provision of business value by employer
  • Agreement to adhere to Code of Ethics

Cloudera Certified Professional: Data Scientist (CCP:DS)

Targeted towards the elite level, the CCP:DS is aimed at data scientists who can demonstrate advanced skills in working with big data. Candidates are drilled in 3 exams – Descriptive and Inferential Statistics, Unsupervised Machine Learning and Supervised Machine Learning – and must prove their chops by designing and developing a production-ready data science solution under real-world conditions.
Related Cloudera certifications include:

EMC: Data Science Associate (EMCDSA)

The EMCDSA certification tests your ability to apply common techniques and tools required for big data analytics. Candidates are judged on their technical expertise (e.g. employing open source tools such as “R”, Hadoop, and Postgres, etc.) and their business acumen (e.g. telling a compelling story with the data to drive business action).
Once you’ve passed the EMCDSA, you can consider the Advanced Analytics Specialty. This works on developing new skills in areas such as Hadoop (and Pig, Hive, HBase), Social Network Analysis, Natural Language Processing, data visualization methods and more.

SAS Certified Predictive Modeler using SAS Enterprise Miner 7

This certification is designed for SAS Enterprise Miner users who perform predictive analytics. Candidates must have a deep, practical understanding of the functionalities for predictive modeling available in SAS Enterprise Miner 7 before they can take the performance-based exam. This exam includes topics such as data preparation, predictive models, model assessment and scoring and implementation.
Related SAS certifications include:
