June has been really exciting so far as I finally managed to complete my Einstein Analytics and Discovery Consultant certification!
Preparing for the Einstein Analytics and Discovery Consultant Certification
Like every Salesforce consultant certification, the “Einstein Analytics and Discovery” certification is a mixture of scenarios you have to solve, some pseudo-“debugging”, and a good number of questions that simply test your knowledge. Read the exam guide and make sure you complete the Trailhead Superbadges for Data Preparation and Analytics & Discovery Insights. Kelsey Shannon has blogged very comprehensively about her certification journey, and Charlie Prinsloo has written the definitive preparation guide (some links require partner community access, though).
Get an EA org (either trial or developer)
If you don’t have experience with Einstein Analytics, then your starting point is getting an org. There is a free (and more or less perpetual) developer org available, and if you want to have a look at the fully configured “real thing”, Salesforce now offers a fully functional 30-day trial packed with sample data, sample dashboards and apps.
- Sign up for a free (and long-lasting) Einstein Analytics developer org
- Get your free 30-day trial org with sample data, apps and dashboards.
Watch the academy video training
If you can’t attend an in-person “Einstein Analytics Academy” class, the EA team has a great alternative for you: Ziad Fayed has recorded a full training as a series of free webinars. It is your number one resource if you want to pass the Analytics certification, and I recommend watching *and building* the solutions Ziad presents.
Use the Templated Apps
It might sound strange, but it’s highly recommended to use your developer org to create at least two essential apps from the App Templates Salesforce provides:
- The Learning App has examples for essential techniques such as bindings.
- The Sales Analytics App has functional examples of a sync scenario, complex dataflows, and dashboards designed according to best practices.
Speaking of templates: You can score some easy points in the exam if you know
- why to use templates (to build AND to distribute apps)
- what they can consist of and how the bundle is actually defined (check the Analytics Template Developer Guide)
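To get a feel for how a template bundle is defined, here is a trimmed, purely illustrative sketch of a template-info.json file; the names are made up, the field list is far from complete, and the authoritative schema is in the Analytics Template Developer Guide:

```json
{
  "templateType": "app",
  "label": "My Sales App",
  "name": "MySalesApp",
  "assetVersion": 47.0,
  "dashboards": [
    { "label": "Pipeline", "name": "Pipeline_db", "file": "dashboards/pipeline.json" }
  ],
  "eltDataflows": [
    { "label": "Sales Dataflow", "name": "Sales_df", "file": "dataflow/sales.json" }
  ],
  "externalFiles": [
    { "label": "Quota", "name": "Quota_csv", "file": "external_files/quota.csv", "type": "CSV" }
  ]
}
```

The point to remember for the exam: the bundle is a set of JSON (and data) files referenced from this one descriptor, which is what makes an app both buildable and distributable.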
Know Dataflows & Recipes inside out
Though it’s not a universal truth, for the sake of the exam stick to the best practice that Sync, Uploads and Dataflows ingest all data into analytics, and recipes work off datasets. You’ll see that once you set up synced sources, you can use the synced source straightaway in a recipe to prepare a dataset. Yet this is not what the certification exam is about.
Know the limitations of synced external sources (such as an Azure DB) as compared to synced Salesforce objects (it’s a good idea to know the limits in this area: How many Salesforce objects can you sync? How many dataflows can you have?)
For dataflows (aka “the purple nodes”), you should know each node type and understand what you use it for:
- Dataset Builder (aka “the blue nodes”) is exclusively for Salesforce objects. It helps you find the finest grain and all related objects, and lets you select fields and relationships.
- sfdcDigest reads from a synced Salesforce object
- edgemart reads from a dataset in Analytics (read: re-use an existing dataset)
- sfdcRegister saves a dataset
- append works like the “union” command – it adds the rows of a second dataset to the existing dataset.
- augment “joins” one dataset to another by adding fields to existing rows, not new rows. In simple words: you choose the key on the “left-hand side” (the data you already have), choose the dataset you want to join, select the field that matches the key, decide whether you expect single or multiple matches, and pick which fields you want to add. The outcome will be the same number of rows with more columns/fields.
- computeExpression lets you create new fields or recalculate values based on the fields in the same row. If row 10 has a “Quantity” of 10 and a “ProductName” of “Cherry Cake”, you can create a formula for a new field “Line Item Label” with the value ‘Quantity’ + ‘ProductName’ + “s”, which builds “10 Cherry Cakes”.
- computeRelative allows you to compare or summarize a row with previous ones (row over row, or based on a grouping that is used as a “partition”).
- dim2mea is a handy tool to convert a dimension to a measure if you need to do that. Unfortunately, there’s no mea2dim (for when you accidentally read a numeric product number as a measure). If you need this, you’ll have to use computeExpression to generate a new text field and convert the value to a string.
- flatten allows you to convert a hierarchy into a directory- or path-like representation that controls access to the row. You can decide whether or not to include what Analytics calls the self_id. The difference it makes is team visibility. Should a team always share their records, you’d need to set “include_self_id” to false. Imagine two records that include self_ids: one has “me/myteam/mymanager/ourboss”, the other has “mycolleague/myteam/mymanager/ourboss”. Neither user will be able to see the other’s record. If you set “include_self_id” to false, both will get “myteam/mymanager/ourboss” as their hierarchy path and their records become shareable among all members of “myteam”.
- prediction allows you to run a prediction from a Discovery model on your dataset’s rows (only available for Einstein Analytics Plus).
- filter does what the name says: it filters records that either match or don’t match a criterion.
- sliceDataset acts like a filter, but for columns. You can choose whether you want to specify the columns/fields to drop or keep.
- digest reads from any connected source and object (read: synced external data)
- update does what the name says: it updates field values in an existing dataset based on a second dataset.
- export was used to push a dataset to Discovery. Nowadays you can do that in the UI with a button on the dataset. It only works with Discovery, and Discovery is only available with Einstein Analytics Plus licenses.
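To make the node list above concrete, here is a minimal, hypothetical dataflow definition that digests Opportunities and Accounts, augments Account fields onto the Opportunity rows, computes a label, and registers the result. All node, field and dataset names are invented, and each node’s parameter set is trimmed to the essentials:

```json
{
  "Extract_Opportunities": {
    "action": "sfdcDigest",
    "parameters": {
      "object": "Opportunity",
      "fields": [
        { "name": "Id" }, { "name": "AccountId" },
        { "name": "Amount" }, { "name": "StageName" }
      ]
    }
  },
  "Extract_Accounts": {
    "action": "sfdcDigest",
    "parameters": {
      "object": "Account",
      "fields": [ { "name": "Id" }, { "name": "Name" }, { "name": "Industry" } ]
    }
  },
  "Augment_Account": {
    "action": "augment",
    "parameters": {
      "left": "Extract_Opportunities",
      "left_key": [ "AccountId" ],
      "right": "Extract_Accounts",
      "right_key": [ "Id" ],
      "relationship": "Account",
      "right_select": [ "Name", "Industry" ],
      "operation": "LookupSingleValue"
    }
  },
  "Compute_Label": {
    "action": "computeExpression",
    "parameters": {
      "source": "Augment_Account",
      "mergeWithSource": true,
      "computedFields": [
        {
          "name": "StageLabel",
          "type": "Text",
          "saqlExpression": "case when 'StageName' == \"Closed Won\" then \"Won\" else \"Open\" end"
        }
      ]
    }
  },
  "Register_Opportunities": {
    "action": "sfdcRegister",
    "parameters": {
      "alias": "OppsWithAccount",
      "name": "Opportunities with Account",
      "source": "Compute_Label"
    }
  }
}
```

Note how the augment keeps the row count of the left side (one row per Opportunity) and only adds columns, exactly as described above.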
Know that the finest grain of your dataset is always determined by what you want to analyse. If your grain is not fine enough (say you only loaded Opportunities, but not Opportunity Line Items), there’s no way to get to the product level with this dataset. You can load the Line Items as a separate dataset and augment it with the existing Opportunity data, but in this case, rebuilding the dataset from scratch would be better.
On the other hand, you can’t run aggregations in Dataflows, so you can’t reduce the grain either. Groupings will help you there.
Exploration, Visualization & Dashboard Design
The exam parts that focus on Exploration and Visualization seem to be quite straightforward. If you know how to navigate the application, know key principles (progressive disclosure) for Dashboard design and know how to review (Dashboard inspector) and improve dashboard performance (e.g. pages, global filters, combine steps and such), you should be able to ace this section. Don’t forget to look into Actions and remember the C-A-S-E-S formula for good data analysis!
A particular focus should be on bindings – there are only a few questions on bindings, but you really need to know them to score these points. Consider building each binding type at least once and make sure you understand what “results binding” vs. “selection binding” means. Look up what a “nested binding” is (not a separate type but a specific way to use a binding), and make sure you understand the functional blocks of the binding syntax. One top resource for that is Rikke Hovgaard’s blog (start here) – hint, hint: Rikke authored *some* questions for the exam (guess which ones…).
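As a rough illustration (step and dataset names invented, dashboard JSON trimmed), a selection binding that filters one step by the selection made in another could look like this, with the binding embedded in the filtered step’s SAQL query:

```json
{
  "OppsByAccount_1": {
    "type": "aggregateflex",
    "query": {
      "query": "q = load \"OppsWithAccount\"; q = filter q by 'StageName' == {{cell(StagePicker_1.selection, 0, \"StageName\").asString()}}; q = group q by 'Account.Name'; q = foreach q generate 'Account.Name' as 'Account', sum('Amount') as 'Amount';"
    }
  }
}
```

The functional blocks to recognize: the data selection (`StagePicker_1.selection` – switching it to `StagePicker_1.result` would make this a results binding), the extraction function (`cell(...)`, or `column(...)` for multiple values), and the serialization function (`asString()`, `asObject()`, etc.).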
Security and Access
Another topic that is straightforward and tricky at the same time.
- Review how to get people access to both Einstein Analytics and Apps that you’ve built.
- Understand the roles (they’re different than “Roles” in Salesforce).
- Again, “Inherited Sharing” vs. “Security Predicate” is a marginal topic, but you can score some precious points there. Make sure you know the limitations of inherited sharing, and how you can leverage security predicates for cases where you hit a wall with inherited sharing.
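A security predicate is defined when the dataset is registered. As a hedged sketch (node, dataset and source names are hypothetical), this is what one looks like on an sfdcRegister node:

```json
{
  "Register_Secure_Opps": {
    "action": "sfdcRegister",
    "parameters": {
      "alias": "SecureOpps",
      "name": "Secure Opportunities",
      "source": "Some_Upstream_Node",
      "rowLevelSecurityFilter": "'OwnerId' == \"$User.Id\""
    }
  }
}
```

The predicate compares a field on each row against an attribute of the running user, so only matching rows are visible – which is also why predicates combine naturally with the path field produced by a flatten node (e.g. matching a role hierarchy path against the user’s role).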
For Einstein Discovery, it’s crucial to know a bit about how data gets into Discovery, and how to analyze and improve the model quality. The discovery part of the exam is too large to be neglected, but still small enough that it won’t blow up your test immediately if you fail some questions here.
Data can be pushed from Einstein Analytics and other sources, including CSV. Click through the import path for both EA and CSV, review the imported data and select the data types, review the columns, the outcome variable (a single one) and the predictors / actionable variables (up to three). You will see that some columns are closely related, and Discovery can prompt you to review whether they really represent the same thing (such as Product Number and Product Name) – or whether there is just a very high correlation. You typically want to drop data only if you really know that they mean the same thing – when in doubt, don’t make assumptions.
Understand the impact of outliers / extreme values: Typically these should NOT be in your analysis because you don’t want edge cases to drive your prediction. Don’t be shy to trim at least everything beyond the 6th standard deviation.
Finally, you should know how to read and understand the charts used by a Discovery story and the quality metrics. While everyone knows bar charts, waterfall charts are lesser known, so it might be a good idea to review whether you really understand how Discovery uses both types to present data to you.
At the time of writing, there are just a few flashcard sets available for memorizing the material. You can find the handful of them by searching for Einstein Analytics combined with any EA-specific term. While flashcards help you massively to memorize terms, limits etc., the one thing that will drastically improve your chances is reading the exam guide closely, getting hands-on experience, and/or actively following the academy training videos. You can still use the old Advanced Accreditation form to test your knowledge. It will give you an idea of what the Analytics team thinks you should focus on, even though it is only for self-assessment and will neither be scored nor give you an accreditation.
The general tips for all Salesforce exams apply here as well:
- know the pass score and what it means in number of questions. There will be 60 questions and the pass score is 68%, so 41 correct answers will let you narrowly pass, and there are up to 19 questions that you can miss.
- use the “mark for review” checkbox whenever you’re not 100% sure about your answer (it will give you a good overview later). Immediately after the last question, you will get the chance to review your checked questions – if your number is 15 or above, it’s a good idea to review all checked questions. Remember that there are probably some wrong answers among those questions that you DIDN’T check for review.
- Read questions AND answers closely. Really, really! There’s a lot of information that you will only recognize on the second or third read. And you will be more successful at separating bogus answers from the correct ones if you scrutinize every single word.
- There aren’t just “correct” and “wrong” answers – there are also items called “distractors” that could be correct… or almost correct. Scan each question and its answers thoroughly for tiny deviations from Salesforce terminology, such as “computeField” (the real name of the dataflow node that computes a field is “computeExpression”). Scan for plural vs. singular, and scan for the wrong order of steps.
- If you don’t know the answer, try to rule out wrong answers.
- If you still have no clue, check the “mark for review” checkbox and don’t waste more time on this item.
I hope this helped you a bit. Good luck with the exam, and let me know how you did!