The Base58 Encoding Scheme
- For example, encoding data with a human-readable alphabet, so that a person can visually confirm the encoded value, can be more useful than encoding it in binary form.
- It is different from Base64 in that the conversion alphabet has been carefully picked to work well in environments where a person, such as a developer or support technician, might need to visually confirm the information with low error rates.
- Second, the non-alphanumeric characters + (plus), = (equals), and / (slash) are omitted to make it possible to use Base58 values in all modern file systems and URL schemes without the need for further system-specific encoding schemes.
- Then, for every byte in 'b58_array', map the byte value using the Base58 alphabet in the previous section to its corresponding character in 'b58_encoding'.
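
The mapping step described above can be sketched in Python. This is a minimal sketch, not the scheme's reference implementation. It assumes the Bitcoin-style alphabet (the summary refers to an alphabet "in the previous section" that is not reproduced here) and the common convention that each leading zero byte becomes the first alphabet character. The names `B58_ALPHABET` and `b58_encode` are illustrative; `b58_array` and `b58_encoding` follow the wording above.

```python
# Assumption: the Bitcoin-style Base58 alphabet, which omits 0, O, I and l.
B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def b58_encode(data: bytes) -> str:
    """Encode bytes as a Base58 string (illustrative sketch)."""
    # Count leading zero bytes; each one is emitted as the first alphabet character.
    leading_zeros = len(data) - len(data.lstrip(b"\x00"))

    # Treat the input as one big-endian integer and convert it to base 58.
    n = int.from_bytes(data, "big")
    b58_array = []
    while n > 0:
        n, remainder = divmod(n, 58)
        b58_array.append(remainder)
    b58_array.reverse()  # most significant digit first

    # Map every value in b58_array to its character in the alphabet.
    b58_encoding = "".join(B58_ALPHABET[value] for value in b58_array)
    return B58_ALPHABET[0] * leading_zeros + b58_encoding
```

With these assumptions, `b58_encode(bytes.fromhex("0000287fb4cd"))` returns `"11233QC4"`: the two leading zero bytes become `"11"`, and the remaining value is written in base 58.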
Feature scoring metrics in word-document matrix
- If we calculate the TF-IDF of these two words in Doc1, we find that the TF-IDF scores of “authentication” and “evaluation” in “Doc1” are equal.
- However, “Doc1” is in the category of “network” and the word “authentication” is more related to “network”.
- Now, if we calculate the PMI between these two words and the category of “network”, the PMI score of “authentication” is greater than the PMI score of “evaluation” in documents with the category of “network”.
- In this article, we compare two feature scoring metrics: TF-IDF and Pointwise Mutual Information (PMI).
- The PMI score finds the importance of a word in a category.
- It is worth mentioning that PMI has another use: calculating the association between different words in a corpus of documents for the purpose of dimensionality reduction.
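
The contrast between the two metrics can be illustrated with a small sketch. The four-document corpus below is invented purely for illustration (only "Doc1", "network", "authentication" and "evaluation" come from the summary above), the function names are hypothetical, and probabilities are estimated from document counts, which is one of several possible choices.

```python
import math

# Hypothetical toy corpus: (document name, category, tokens).
corpus = [
    ("Doc1", "network", ["authentication", "evaluation", "protocol"]),
    ("Doc2", "network", ["authentication", "router"]),
    ("Doc3", "learning", ["evaluation", "model"]),
    ("Doc4", "learning", ["model", "training"]),
]


def tf_idf(word, tokens, corpus):
    """Term frequency in one document times log inverse document frequency."""
    tf = tokens.count(word) / len(tokens)
    df = sum(1 for _, _, doc_tokens in corpus if word in doc_tokens)
    return tf * math.log(len(corpus) / df)


def pmi(word, category, corpus):
    """PMI between a word and a category, estimated from document counts."""
    n = len(corpus)
    p_word = sum(1 for _, _, t in corpus if word in t) / n
    p_cat = sum(1 for _, c, _ in corpus if c == category) / n
    p_joint = sum(1 for _, c, t in corpus if c == category and word in t) / n
    if p_joint == 0:
        return float("-inf")  # the word never occurs in this category
    return math.log2(p_joint / (p_word * p_cat))


doc1_tokens = corpus[0][2]
for word in ("authentication", "evaluation"):
    print(word, tf_idf(word, doc1_tokens, corpus), pmi(word, "network", corpus))
```

In this toy corpus both words get the same TF-IDF in Doc1, because they appear equally often in Doc1 and in the same number of documents overall. The PMI with "network" is 1.0 for "authentication" and 0.0 for "evaluation", since only "authentication" is concentrated in that category.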
First shipments of Pfizer vaccine to be delivered December 15
- The document, provided to governors ahead of a call with the Vice President Monday, also estimated the first shipment of Moderna's vaccine will be delivered on December 22.
- The FDA's Vaccines and Related Biological Products Advisory Committee, a panel of independent experts, will meet on December 10 to review Pfizer's data and make a recommendation to the FDA about whether to authorize the vaccine.
- The document outlined a four-day window between December 11 and December 14 for review by the FDA and CDC's Advisory Committee on Immunization Practices, which makes recommendations about who should receive the vaccine first.
- After a four-day review window from December 18 to December 21, the document stated that the first shipments of Moderna's vaccine are estimated to be delivered on December 22.
Tesla case against Martin Tripp raises questions about scrap costs - Business Insider
- Tripp's case was settled on Tuesday for $400,000, but court documents leaked during the case point to hundreds of millions in wasted materials unaccounted for on Tesla's balance sheet.
- Tesla sued Tripp in 2018 after he leaked documents that indicated that Model 3 production had generated $140 million worth of scrap from January to June of that year.
- The leaked documents indicate that scrap at the factory was a major concern at the highest levels of the company — from CEO Elon Musk and CFO Deepak Ahuja down the chain of command.
- On June 8th, 2018, Tripp sent a detailed email to his bosses (including Elon Musk) warning them that if the company didn't do anything to control battery scrap as it ramped up production of Model 3s to 5,000 cars a week, the cost could total $200 million in the second half of the year alone.
F-beta Score in Keras Part III
- In part II of this article, we implemented the F-beta score in Keras for multiclass problems, both as a stateful and as a stateless metric.
- In this article, we explain how the F-beta score can be applied to multi-label classification problems and create both stateful and stateless custom F-beta metrics for multi-label classification in Keras.
- As in the multiclass problem, metrics like the F-beta score can be calculated per class before being aggregated using the micro, macro, or weighted method.
- For a multi-label F-beta metric, the state variables would be true positives, actual positives, predicted positives, the number of samples, and the sum of F-beta scores, because they can easily be tracked across all batches.
- The F-beta score can be implemented in Keras for multi-label problems either as a stateful or as a stateless metric, as we have seen in this article.
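
The idea of tracking state variables across batches can be sketched without Keras. The class below is a plain-Python illustration, not the article's Keras implementation. `MultiLabelFBeta`, its parameters, and the 0.5 threshold are assumptions; the method names `update_state` and `result` only mirror the Keras metric convention. It tracks per-class true positives, actual positives, and predicted positives, and covers micro, macro, and weighted averaging. The per-sample variant, which would need the number of samples and the sum of F-beta scores, is left out.

```python
class MultiLabelFBeta:
    """Plain-Python sketch of a stateful multi-label F-beta metric (not a Keras class)."""

    def __init__(self, num_classes, beta=1.0, average="macro", threshold=0.5):
        self.beta2 = beta ** 2
        self.average = average          # "micro", "macro" or "weighted"
        self.threshold = threshold      # assumed cut-off for a positive prediction
        self.tp = [0] * num_classes     # true positives per class
        self.ap = [0] * num_classes     # actual positives per class
        self.pp = [0] * num_classes     # predicted positives per class

    def update_state(self, y_true, y_pred):
        """Accumulate counts from one batch of 0/1 labels and predicted scores."""
        for true_row, pred_row in zip(y_true, y_pred):
            for k, (t, p) in enumerate(zip(true_row, pred_row)):
                predicted = 1 if p >= self.threshold else 0
                self.tp[k] += t * predicted
                self.ap[k] += t
                self.pp[k] += predicted

    def _fbeta(self, tp, ap, pp):
        # F-beta = (1 + b^2) * TP / (b^2 * actual positives + predicted positives)
        denom = self.beta2 * ap + pp
        return (1 + self.beta2) * tp / denom if denom else 0.0

    def result(self):
        if self.average == "micro":
            return self._fbeta(sum(self.tp), sum(self.ap), sum(self.pp))
        per_class = [self._fbeta(*counts) for counts in zip(self.tp, self.ap, self.pp)]
        if self.average == "weighted":
            total = sum(self.ap)
            if not total:
                return 0.0
            return sum(f * a for f, a in zip(per_class, self.ap)) / total
        return sum(per_class) / len(per_class)  # macro


metric = MultiLabelFBeta(num_classes=3, average="macro")
metric.update_state([[1, 0, 1], [0, 1, 0]], [[0.9, 0.2, 0.4], [0.1, 0.8, 0.7]])
metric.update_state([[1, 1, 0]], [[0.6, 0.3, 0.1]])
print(metric.result())
```

Because only counts are stored, the result after several batches equals the score computed on all samples at once. A stateless metric that averages per-batch scores does not generally have this property.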
Leaked files reveal China's chaotic early Covid-19 response
- In a report marked "internal document, please keep confidential," local health authorities in the province of Hubei, where the virus was first detected, list a total of 5,918 newly detected cases on February 10, more than double the official public number of confirmed cases, breaking down the total into a variety of subcategories.
- Even as authorities in Hubei presented their handling of the initial outbreak to the public as efficient and transparent, the documents show that local health officials were reliant on flawed testing and reporting mechanisms.
- On March 7, the total death toll in Hubei since the beginning of the outbreak stood at 2,986, but in the internal report it is listed as 3,456, including 2,675 confirmed deaths, 647 "clinically diagnosed" deaths, and 126 "suspected" case deaths.
What is Data?
- However, as data are collected and processed, the most useful context information is often lost, making the data harder to discover and leverage further.
- On the contrary, data need to be collected first to represent a group of facts and then processed to provide meaningful information.
- Next, all the data’s contextual information should be documented, including use cases, data sources, data collection methods, data transformation rules, related references, etc.
- Furthermore, as an organization matures in data governance, documenting the data by working directly with the Data Catalog should become a habit for everyone who works with or uses the data, ultimately leading to a happy business community given the transparency to trust the data provided to them.
- Data scientists, product managers, and data analysts bridge the gap by documenting the data, championing the data architecture, leveraging the data catalog, supporting data governance, and ultimately enabling the business to discover and fully realize the potential value of the data.
Border patrol settles lawsuit with US citizens detained after an agent heard them speaking Spanish
- Ana Suda and Martha "Mimi" Hernandez, who live in the small town of Havre, Montana, say they were detained by a border patrol agent while waiting to pay for groceries at a local convenience store in May 2018.
- The agent, Paul O'Neill, approached them, commented on their accent, and asked where they were born, according to the ACLU.
- When they responded -- Suda is from Texas and Hernandez is from California -- he asked to see identification and questioned them for 40 minutes, they say.
- In a statement to CNN, Border Patrol said their agents are trained to enforce laws uniformly and fairly, without discriminating based on religion, race, ethnicity or sexual orientation.
- Suda, who recorded the incident on her cellphone, is shown in the video asking the agent why he questioned them.
5 Technical Behaviors I’ve Learned from 2 Years of Data Science and Engineering
- Getting feedback on the direction you are taking and the conclusions you are drawing from the data can help you decide on your next steps.
- I have found this helpful with software and data science projects as you can get an initial direction for the project and start getting feedback.
- Whether you are creating a dashboard or developing a software library, creating an MVP first and getting feedback will help make sure you are going in the right direction with the product.
- Building in MVPs and brief research projects has helped us determine whether our roadmap needs to shift direction to stay aligned with what is or is not working.
- Learning to document and read code are two valuable skills I took for granted during my college years.