Articles related to "data"


Microsoft Edge will notify you if your passwords have been compromised

  • Microsoft is introducing a new feature to Edge called Password Monitor so you don’t have to turn to a website like Engadget to find out one of your passwords has been compromised.
  • The next time one of the passwords you have saved to Edge is included in a third-party data breach, you’ll be notified to change it.
  • Both Chrome and Firefox have offered similar functionality for the past few years, as have password managers like 1Password and LastPass.
  • What makes Microsoft’s offering intriguing is that it uses a relatively new approach called homomorphic encryption to ensure that no one at the company, nor any other party, can find out your passwords.
  • It also works with a variety of machines, including those with relatively old CPUs, so that anyone can take advantage of Password Monitor.
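
The key property behind homomorphic encryption is that a server can compute on encrypted values without ever decrypting them. The sketch below illustrates that property with a toy version of the textbook Paillier scheme, which supports addition on ciphertexts. It is an assumption chosen for illustration, not the scheme Password Monitor uses, and the tiny primes and the function names (`keygen`, `encrypt`, `decrypt`, `add_encrypted`) are hypothetical and far too weak for real use.

```python
import math
import random


def keygen(p, q):
    """Build a toy Paillier key pair from two small primes (illustration only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # valid because the generator is fixed to g = n + 1
    return n, (lam, mu)   # public key n, private key (lam, mu)


def encrypt(n, message):
    """Encrypt an integer 0 <= message < n under the public key n."""
    n_sq = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:  # the blinding factor must be invertible mod n
            break
    return (pow(n + 1, message, n_sq) * pow(r, n, n_sq)) % n_sq


def decrypt(n, private_key, ciphertext):
    """Recover the plaintext using the private key."""
    lam, mu = private_key
    l_value = (pow(ciphertext, lam, n * n) - 1) // n
    return (l_value * mu) % n


def add_encrypted(n, c1, c2):
    """Combine two ciphertexts so the result decrypts to the sum of the plaintexts."""
    return (c1 * c2) % (n * n)


if __name__ == "__main__":
    n, private_key = keygen(61, 53)
    total = add_encrypted(n, encrypt(n, 5), encrypt(n, 7))
    print(decrypt(n, private_key, total))  # the party adding never saw 5 or 7
```

The party that calls `add_encrypted` only handles ciphertexts; only the holder of the private key can read the result.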



How to Combine Animated Plots in R

  • In this tutorial, I’m going to outline the steps necessary to create an animated, faceted plot in R.
  • Although rare, combining animated plots can be a powerful way to showcase different elements of the same data (as you’ll see below).
  • A faceted, animated plot is a great option because we’d like to observe the magnitude of these differences and how these differences have evolved over time.
  • The first visualization we’ll create for the final output is a dumbbell plot.
  • First, we can create a static visualization using ggalt (again, my blog post covers the details of this step).
  • But in our case, we’d like to include another GIF: a line chart of differences over time.
  • Finally, we’ll combine them using magick (thanks to this post).
  • This is where you’ll find the code necessary to do the combination (no matter what your animated plots look like, this should work!).



Best data science tools for academia in 2021

  • This article presents the best tools for an academic data scientist in 2021.
  • There are plenty of options for Machine Learning as a Service (MLaaS) to train models on the cloud, such as Amazon SageMaker, Microsoft Azure ML Studio, IBM Watson ML Model Builder and Google Cloud AutoML.
  • A few years ago, Microsoft Azure was probably the best option, since it offered services such as anomaly detection, recommendations, and ranking, which Amazon, Google, and IBM did not provide back then.
  • Tools like PyCharm and Visual Studio Code are almost standard for Python development.
  • Before I started using Anaconda, I would often run into all sorts of issues trying to use scripts that were developed with a specific version of packages like NumPy and pandas.
  • Overall this article recommends implementing tools like Anaconda and Jupyter.



How to Create a Beautify Combo Chart in Python Plotly

  • The finished combo chart will look like this.
  • The font of the axis labels on the horizontal and vertical axes is clearly a little too small, which makes them hard to read.
  • The scale line also does little to help pinpoint the actual value of each bar.
  • The first step is to enlarge the font of the axis labels and remove the scale line.
  • Titles need to be added to tell end users what the chart is about.
  • The bars will be colored light blue and outlined to make them more visible.
  • There are still some small adjustments required to make the chart more informative and less confusing.
  • The labels and title are also informative.
  • Grid lines are helpful in pinpointing the actual value of each bar.
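
The styling steps above can be sketched as a figure specification. This is a minimal sketch, not the article's code: it builds the figure as a plain dictionary in Plotly's figure schema so it runs without Plotly installed, and the function name `build_combo_figure`, the series names, the colors, and the font sizes are all assumptions.

```python
def build_combo_figure(categories, bar_values, line_values):
    """Return a Plotly-style figure spec (plain dict) for a bar + line combo chart."""
    bar_trace = {
        "type": "bar",
        "name": "Volume",  # hypothetical series name
        "x": list(categories),
        "y": list(bar_values),
        # Light blue bars with an outline to make them easier to see.
        "marker": {"color": "lightblue",
                   "line": {"color": "steelblue", "width": 1.5}},
    }
    line_trace = {
        "type": "scatter",
        "mode": "lines+markers",
        "name": "Rate",  # hypothetical series name
        "x": list(categories),
        "y": list(line_values),
        "yaxis": "y2",  # draw the line against a secondary y-axis
    }
    layout = {
        "title": {"text": "Example combo chart"},
        "plot_bgcolor": "white",
        # Larger tick fonts so the axis labels are readable.
        "xaxis": {"tickfont": {"size": 14}},
        # Horizontal grid lines on the primary axis help read off bar values.
        "yaxis": {"title": {"text": "Volume"}, "tickfont": {"size": 14},
                  "showgrid": True, "gridcolor": "lightgray"},
        "yaxis2": {"title": {"text": "Rate"}, "tickfont": {"size": 14},
                   "overlaying": "y", "side": "right", "showgrid": False},
    }
    return {"data": [bar_trace, line_trace], "layout": layout}


if __name__ == "__main__":
    spec = build_combo_figure(["Q1", "Q2", "Q3"], [120, 150, 90], [0.12, 0.18, 0.09])
    print(sorted(spec["layout"]))
```

If Plotly is installed, passing the returned dictionary to `plotly.graph_objects.Figure(...)` and calling `.show()` should render the chart.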



Advanced Options with Hyperopt for Tuning Hyperparameters in Neural Networks

  • We start with specifying how many data points we need, as well as the model parameters (feel free to change these around to see a different model response — maybe you want to simulate a race car by simulating a fast acceleration, or a higher maximum speed by increasing the gain, or Kₚ).
  • We want to create a machine learning model that simulates similar behavior, and then use Hyperopt to get the best hyperparameters.
  • The last step is to return information we might want to use later in the code, such as the loss of our objective function, the Keras model, and the hyperparameter values.
  • We pass the f_nn function we provided earlier, the space containing the range of hyperparameter values, define the algo as tpe.suggest, and specify the max_evals as the number of sets we want to try.
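
The shape of that search loop can be sketched without any dependencies. The example below swaps in plain random search for Hyperopt's TPE algorithm, so it is not what `tpe.suggest` does; it only mirrors the pieces named above: an objective that returns its loss along with extra information, a search space, and a `max_evals` budget. The objective, the space, and the function names are hypothetical stand-ins for the article's `f_nn` and Keras model.

```python
import random


def objective(params):
    """Hypothetical stand-in for training a model and measuring its loss."""
    loss = (params["learning_rate"] - 0.01) ** 2 + 0.001 * abs(params["units"] - 64)
    # Return the loss plus anything we may want to inspect later.
    return {"loss": loss, "params": params}


# Each entry draws one value for a hyperparameter from its range.
SPACE = {
    "learning_rate": lambda rng: 10 ** rng.uniform(-4, -1),  # log-uniform draw
    "units": lambda rng: rng.choice([16, 32, 64, 128]),
}


def random_search(fn, space, max_evals, seed=0):
    """Evaluate max_evals random hyperparameter sets and return the best one."""
    rng = random.Random(seed)
    trials = []
    for _ in range(max_evals):
        params = {name: sample(rng) for name, sample in space.items()}
        trials.append(fn(params))
    best = min(trials, key=lambda trial: trial["loss"])
    return best, trials


if __name__ == "__main__":
    best, trials = random_search(objective, SPACE, max_evals=25)
    print(best["params"], best["loss"])
```

With Hyperopt itself, the corresponding call passes the objective, the space, `algo=tpe.suggest`, and `max_evals` to `fmin`, as the summary describes.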



Making a Better Filled Map

  • In instances where it’s important to have the exact continuous variable represented, start by considering whether there’s direction to the variable and if this can influence color choice to improve perceptions — is one outcome better than the other (green vs.
  • Continuous variables without significant differences in scale are challenging to map, but it’s possible to improve upon a baseline using the simple visual tricks above.
  • When showing hierarchy is an option, rather than exact continuous values, I like to use what I call hierarchal mapping to balance color, text, and attention.
  • Mapping can be an extremely effective way to communicate information, and with a few handy tricks we can make better visualizations that improve how our audience interprets the data and lead to more impactful presentations and data stories.



Introduction to NoSQL with MongoDB

  • Relational databases store data in tabular form with labelled rows and columns.
  • Although relational databases usually provide a decent solution for storing data, speed and scalability might be an issue in some cases.
  • SQL (Structured Query Language) is used by most relational database management systems to manage databases that store data in tabular form.
  • The common structures adopted by NoSQL databases to store data are key-value pairs, wide column, graph, or document.
  • In order to overcome these challenges, MongoDB introduces a new format called BSON (Binary JSON).
  • However, MongoDB solves this issue by allowing users to export BSON files in JSON format.
  • We can store the data in BSON format and view it as JSON.
  • Any file in JSON format can be stored in MongoDB as BSON.
  • We have previously mentioned that documents in MongoDB are organized in a structure called a collection.
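
To make the document model concrete, here is a minimal in-memory sketch using only the Python standard library. It is not the MongoDB driver and does not produce BSON: a list of dictionaries stands in for a collection, and the sample documents and the `find` and `to_json` helpers are hypothetical.

```python
import json

# A "collection" modeled as a list of documents (plain dicts).
# Unlike rows in a table, documents need not share the same fields.
users = [
    {"_id": 1, "name": "Ada", "languages": ["Python", "R"]},
    {"_id": 2, "name": "Lin", "address": {"city": "Oslo"}},
]


def find(collection, query):
    """Return documents whose top-level fields equal every key/value in query."""
    return [
        doc for doc in collection
        if all(key in doc and doc[key] == value for key, value in query.items())
    ]


def to_json(collection):
    """Serialize the documents as JSON text, the human-readable view."""
    return json.dumps(collection, indent=2)


if __name__ == "__main__":
    print(find(users, {"name": "Ada"}))
    print(to_json(users))
```

Nested fields and arrays sit inside a single document here, where a relational design would typically split them across joined tables.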



The FCC has its first Chairwoman in Jessica Rosenworcel

  • The Biden administration has officially appointed Commissioner Jessica Rosenworcel acting FCC Chairwoman, making her the first woman to hold the position, and she will likely be nominated to fill the position formally later in the year.
  • While Rosenworcel’s agenda will be made clear over the coming weeks and months, it is likely we will see the return of net neutrality from the shallow grave dug for it by Ajit Pai, and probably a new effort to better understand which parts of the country actually need help getting broadband, and how to deliver it quickly and equitably.
  • The work of an FCC Commissioner, their staff, and the bureaus they rely on, is largely obscure and technical, with moments like those listed above more the exception than the rule.



Exploratory Factor Analysis vs Principal Components: from concept to application

  • PCA is based on the formative model, where the variation in the component is based on the variation in item responses (e.g. level of income will affect socioeconomic status).
  • We usually use two tests to measure if our data is adequate to proceed with EFA.
  • If data are normally distributed, it is recommended to use maximum likelihood, since it enables a variety of goodness of fit indices, significance test of factor loadings, calculation of confidence intervals, etc.
  • However, if the data do not follow a normal distribution, it is recommended to use principal axis factoring.
  • The significance level was smaller than 0.05, which means we can proceed with EFA (if we assume values under 0.05 indicate the adequacy of our data).
  • Few studies evaluate the use of goodness of fit indices in EFA, so it may be difficult to interpret this part of the data.
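
The summary does not name the two adequacy tests. Assuming one of them is Bartlett's test of sphericity, which matches the "significance below 0.05" criterion above, its statistic can be written as follows, where n is the number of observations, p the number of variables, and R the sample correlation matrix:

```latex
\chi^2 = -\left(n - 1 - \frac{2p + 5}{6}\right) \ln \lvert R \rvert,
\qquad
df = \frac{p(p - 1)}{2}
```

The null hypothesis is that R is the identity matrix, meaning the variables are uncorrelated. A small p-value rejects it, which suggests there is enough shared variance for factor analysis to be worthwhile.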



How To Get a Job in Data Science or Analytics Without a STEM Degree

  • So if Google, Tesla, Apple, and many of the largest and most technical firms on the planet no longer require a degree, then why does a degree still carry such a premium for people looking for technical jobs like Data Scientist, Analyst, or Data Engineer?
  • The key phrase from this quote is “record of exceptional achievement.” There is no way of getting around it — if you have no formal training, you have to prove to the hiring manager that you can perform the job purely through your experiences.
  • Already working somewhere is by far the best possible position to be in to make the jump to data science or analytics, as you likely already have access to some data and will be able to find relevant work that allows you to become an analyst without having the job title.
