Monday, September 18, 2017

Salesforce Big Objects

Was just experimenting with Salesforce Big Objects and found it quite interesting. It is mainly used for big data (100M+) analytics and mostly for asynchronous data crunching like Hadoop. However, there are very critical distinction before we go with Big Object.


  • Currently Big Object only support fields and permission, and that's about it
  • We can not have
    • triggers
    • page layouts
    • extensive SOQL (indexed SOQL is supported but that's extremely limited - and make sense as we are dealing with humongous data set)
    • no workflows, process builders, etc..
    • no reports
  • Basically it is completely non UI, and just for back end data stores for big data analytics - and that's about it.

Use case

In org, we can have survey on and Object record (e.g. Account, Opportunity, etc..), and would want to store those survey data in Big Object and later analyze it. 


How to user it :


1. Create Big Object

  • There is no User Interface to create the Big Object and its fields. We must use metadata API (using Ant Migration tool or workbench) to create such artifacts. Obviously, workbench makes it a lot easier.
  • Create object file 
  • Create permission set file
  • Create package.xml file
  • Nicely bundle them in .zip file and in right directory structure (you can download it from here)
  • Use workbench to deploy the .zip file and BigObject should like below
  • You can assign permission set (Survey_BigObject) to right user so they can query and update the data



  • Pay close attention to indexed fields, this is used for inserting records - identifying duplicates, and issuing synchronous SOQL queries. 

2. Insert Data

  • We can insert data just like we do in Apex for any other Object, or we can use Bulk API
  • There is no upsert operation, salesforce will automatically check the record being inserted against the indexed value, and if index values are same, then it will do update. otherwise insert.
  • Upon failure, there are no errors, we just need to look at saveResult or saveResults.

 Account a = [ select id, name from account limit 1 ];  
 Survey__b survey = new Survey__b();  
 survey.WhatID__c = a.id;  
 survey.WhatTime__c = System.today() + 1;  
 survey.WhatObject__c = 'Account';  
 survey.Question1__c = 'What is the rating';  
 survey.Answer1__c = '1';  
 Database.SaveResult saveResult = database.insertImmediate(survey);  
 System.debug( ' success ' + saveResult.isSuccess() + ' ' + saveResult );  
   



3. Query Data

  • Querying data is quite tricky with Big Object. Either you can query all records, which most probably is going to fail when we have millions of records
  • Or synchronous SOQL can only be issued against indexed fields. And also indexed fields must be in order in the query. See below for example:
 List<Survey__b> surveys = [ select id, WhatId__c, WhatObject__c, WhatTime__c, Question1__c, Answer1__c from Survey__b ];  
 for(Survey__b survey : surveys ) {  
   System.debug( survey );  
 }  
   
   
 System.debug(' -------------- indexed query -------------- ');  
 /** no gap is allowed and only indexed field in exact order can be used for query , we can skip but no gap is allowed, e.g. below  
 *  [select id from Survey__b] is fine  
 *  [select id from Survey__b where WhatID__c = :a.id ] is fine  
 *  [select id from Survey__b where WhatTime__c = :System.today() ] is NOT fine, as you can't jump to index2 without having index1 in the query.  
 * **/  
 Account a = [ select id, name from account limit 1 ];  
 List<Survey__b> surveys2 = [ select id, WhatId__c, WhatObject__c, WhatTime__c, Question1__c, Answer1__c from Survey__b where WhatID__c = :a.id and WhatTime__c = :System.today() ];  
 for(Survey__b survey : surveys2 ) {  
   System.debug( survey );  
 }  
   
   

4. Asynchronous SOQL


  • Asynchronous soql is only supported using Rest API
  • We have to provide asynchronous SOQL and then the custom Object to store the result
  • It seems like only one Async SOQL can run at any given time - at least in org I worked on

4.1 Create Custom Object To Store Async Result


  • Created suvey analysis object to store the analysis of the query with counts

4.2 Run Asynchronous SOQL

  • Below is how asynchronous SOQL looks like, we need to provide SOQL, and then target table and mapping between selected field and target table's field.

 {  
   "query": "select Question1__c, Answer1__c, count(whatId__c) c from Survey__B where WhatObject__c = 'Account' group by Question1__c, Answer1__c",  
   "operation": "insert",  
   "targetObject": "Survey_Analysis__c",  
   "targetFieldMap": {  
     "Question1__c": "Question1__c",  
     "Answer1__c": "Answer1__c",  
     "c":"Count__c"  
   }  
 }  



  • We can execute it using Rest API on Workbench


Once Asynchonous SOQL query job is completed, we can query Survey_Analysis__c  object for accumulated result.


17 comments:

Unknown said...

It was nice to get an idea of big object from your blog, it is nicely documented. We have currently 1.5 million records in a custom object,w e want to archive the records every three month into big object and then use the big object to display report in salesforce. Could you please help to in understanding the architecture to be implemented.

Anonymous said...
This comment has been removed by the author.
Chintan Shah said...

I guess either via Heroku or Batch job.

James said...

I tried to insert the same code from developer console but didn't work..Is there any limitation that we can not execute big object code from developer console.

Chintan Shah said...

James, what error do you get? I was able to.

Unknown said...

Hi,
I'm getting an Error while executing Async SOQL as "errorCode":"INSUFFICIENT_ACCESS","message":"You don’t have permission to query this object Account with Async SOQL.
What Permissions I have to give ,Am I Missing anything here ...Please Let Me Know Thanks in Advance

Unknown said...

Hi Mubeena,

During POC done by me I got a response from salesforce team that to enable Async SOQL in any org, you need to buy atleast a 50 million record Block of Big Object.
Now, what do you mean by 50 Million Block?

It can be understood that by default Salesforce provides 1 Million record as a storage space in Big Object. To upgrade the limitation of 1 Million, customer has to purchase storage in Big Object as 50 Million Records Block. You can buy multiple such blocks i.e N * each 50 Million Block.

And Enable Asycn SOQL can be done only if you buy atleast a 50 Million Block.

Price quoted for a 50 Million Block in 9600 UK Pound.

I hope it helps.

Thanks,
Srikant

Unknown said...
This comment has been removed by the author.
Anonymous said...

It is very nice blog to help about the Big object when I was created the Big object by the ANT Script,
But by the "WorkBench" it was not created, Can you please help me to create the Big object by the "Workbench"
Thanks in Advance.

Unknown said...

Hi,
How to delete bigObject record ??i'm using database.deleteimmediate() but i'm getting DML operations are not allowed. may I know how to delete.

AP Cartoons said...

I am getting below error when I try to insert a record

Line: 5, Column: 9
Variable does not exist: WhatObject__c

Unknown said...

@AP Cartoons
check your permissions for field WhatObject__c in Big object

AP Cartoons said...

oh!! overlooked that part. Thank you.

Sweety Kumari said...

We have currently 1.5 million records in a custom object,we want to movethe records after every three month into big object and then use the big object to display report in salesforce. we are implementing using batch apex. Record is inserted into big object. But not showing any record when i am querying using SOQL.
Please some provide me solution, where i am doing mistake?

Unknown said...

@ Sweety :

Big Objects don't support "Database.QueryLocator()" you will get "Internal Salesforce Error" so maybe because of that you are not getting any records. Use Database.Query() instead it will work.

Unknown said...

I was trying fetch index field (Of big object) definition using tooling API query. But the behavior is very strange when I check the schema using setup it does shows index field correctly but tooling API did not returned same results.

What I had observed was “IsIndexed” flag itself was showing wrong values, then how come same big object schema shows correct index via setup.

sample query#
Returns correct results –
SELECT QualifiedApiName, IsIndexed FROM FieldDefinition WHERE IsIndexed = true
AND EntityDefinition.QualifiedApiName = ‘Account’

Returns wrong results
SELECT QualifiedApiName, IsIndexed FROM FieldDefinition WHERE IsIndexed = true
AND EntityDefinition.QualifiedApiName = ‘CustomeBigObject__b’

ID, false //but this is marked as index field in big object
Name, false //but this is marked as index field in big object
Comments, false
e.g. If there is only one index field then above query returns zero records; If three fields were marked as index then it returns first index field info. But same tooling API query for standard objects returns correct data.

any idea?

Unknown said...

This blog is amazing!