Point to point integration tips for Salesforce

Salesforce is an awesome piece of software. It is amazingly extensible and configurable, and I have seen people do amazing things with it. All of this extensibility and ease of use can be deceiving though, because unlike people’s imaginations, the platform has technical limitations. These limitations make it very difficult, or sometimes impossible, to build certain types of applications.

Salesforce is a wonderful platform to integrate to or from. The API is great, there are connectors for several different languages and integration middlewares, and you can create custom APIs in REST or SOAP. Using a proper integration platform such as Boomi, CastIron, or Informatica, you can do pretty much any kind of integration your heart desires. However, to keep costs low, people often want Salesforce to talk directly to the other systems to fetch and save the necessary data, a.k.a. point to point integration. This is where things can get very tricky.

Integrations involve complex aspects like data security, data transformation and mapping, handling high volumes of data, error handling and recovery, notifications, etc. All of these come out of the box with an integration middleware solution, but have to be custom built in Salesforce within the confines of the “box” defined by Salesforce governor limits and Apex language limitations. We also have to remember that highly customized solutions in Salesforce tend to be harder to implement, extend, maintain, support, and administer.

Nowadays, any multi-tenant environment has to have limits, because poorly architected applications can misbehave and impact other tenants. If you compare the limits that Salesforce has with the limits of other platforms you will probably be disappointed. For example, Salesforce limits the heap size to 6 MB, while the free version of Google App Engine limits it to 100 MB. However, this is a completely unfair comparison, and here is why:

Salesforce is a Software-as-a-Service solution. Google App Engine, AWS Beanstalk, Heroku, etc. are Platform-as-a-Service solutions. Comparing them is like comparing apples and oranges. Yes, PaaS solutions offer more to custom application developers, which is what they are designed to do. They ONLY support custom application development. Salesforce is a highly customizable CRM solution. It was never designed to be a custom application development platform. If you think about it this way, the simple fact that people make this comparison proves how amazing Salesforce is. Do you want to hear another unfair comparison going in the other direction? Compare the development cost and time to market of a tailored CRM solution in Salesforce vs. building one from scratch yourself in one of the PaaS vendors mentioned above.

I am not saying one is better than the other. All I am saying is that we are very fortunate to have so many options out there, and to be able to select the right platform for the right job. Which takes us back to integration orchestration in Salesforce. Basically, Salesforce is not the right platform for it.

The reason is simple, as I already mentioned: the governor limits and the Apex language limitations. They define a box, quite a small one by the way, and your customization has to fit in there. As the application grows and you add more features, you push more and more against the sides of the box, up to the point where you run out of wiggle room. On top of that, you have to remember that existing customizations already take up some space inside that box, and you have to be a good neighbor.

But just saying this doesn’t really drive the message home. The governor limits only really jump off the page when you hit them. I certainly did, and I want to provide some examples:

WSDL limitations
Salesforce has serious limitations in its support for modern WSDLs. If someone asks whether you can invoke a given web service from Salesforce, the correct answer is always “I don’t know”.

The first thing to do is get the WSDL and import it into Salesforce. There is a chance it will import correctly, but more likely you will have to tweak the generated code a little. One simple example: in Salesforce, any class whose name ends in Exception has to extend Exception, and many WSDLs define custom exception classes.

If after a few tweaks you are able to save the generated classes, that is a very good sign, but it doesn’t mean you are in the clear yet. There are other issues you may encounter. For instance, if the service uses inheritance and polymorphism, Salesforce won’t support it, and you won’t be able to invoke the service without some major changes to the generated classes. Check out my other blog post on this: Salesforce wsdl2apex and class inheritance.

If your service could not be imported, or if there are significant issues in the generated classes, all is not lost. One thing you can still do is build the SOAP XML yourself and invoke the service directly using the Http classes in Salesforce. SoapUI is a godsend if you want to do this. First invoke the service from SoapUI to learn the structure of the request and response, then use the Http classes to send the same request and parse the response using the DOM classes. This is not a good approach, so use it only as a last resort. If you absolutely have to do this, beware of the heap size limit described below: generating and parsing XML with the Salesforce classes uses a lot of memory, and you may hit that limit.

Can’t catch governor limit exceptions
If you hit a governor limit, the process ends. You don’t get a chance to catch the exception, save your work in progress, or revert things. The transaction is rolled back, and that is it!

The way to work around this is to avoid hitting limits at all costs. Salesforce provides a convenient class called Limits that helps you with that. You should spend time fine tuning your code to understand exactly which limits it consumes and how much of each. Then, before you start a process, check whether you have enough room to run it, or whether you should break it down and save the state of your execution so it can be resumed on the next run.
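The check described above can be sketched in a language-neutral way. The Python below is only an illustration of the pattern, not Apex: `Budget`, `has_headroom`, and `run_steps` are hypothetical names, and in Apex the used and allowed values would come from the Limits class (for example `Limits.getCallouts()` and `Limits.getLimitCallouts()`).

```python
from dataclasses import dataclass


@dataclass
class Budget:
    """Hypothetical stand-in for one governor limit: used so far vs. allowed."""
    used: int
    limit: int

    def remaining(self) -> int:
        return self.limit - self.used


def has_headroom(budgets, needs):
    """Return True only if every limit has room for what the next step needs."""
    return all(budgets[name].remaining() >= amount for name, amount in needs.items())


def run_steps(steps, budgets):
    """Run steps in order and stop before the first one that would not fit.

    Each step is a (needs, action) pair, where needs maps a limit name to the
    amount the step is expected to consume. Returns the index of the step to
    resume from on the next execution, or None when everything ran.
    """
    for index, (needs, action) in enumerate(steps):
        if not has_headroom(budgets, needs):
            return index  # stop cleanly instead of hitting the limit
        action()
        for name, amount in needs.items():
            budgets[name].used += amount
    return None
```

The important design choice is that the check happens before each unit of work, because once the limit is hit there is no opportunity to recover.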

Limited JSON parser
This one is more specific to web services that use JSON to transfer data. The JSONParser class in Salesforce is very limited. If you can create concrete Apex classes to deserialize the JSON into, you should be good, but that is not always possible. For lower level processing, I feel you are better off building your own JSON parser, or something on top of the JSONParser class, which is nothing more than a tokenizer.

Heap size limitation
For synchronous Apex the limit is 6 MB; for asynchronous Apex it is 12 MB. This is very little space, especially if you are handling a slightly more than moderate volume of data and doing transformations.

As I mentioned, I hit this limit when trying to parse an XML file, specifically a report generated by another system. The response wasn’t extremely large, only a couple of megabytes. But when I tried to parse it with either the standard Salesforce classes or my own parsing logic, the number of strings being created and manipulated made me exceed the 6 MB limit very easily. The solution for this particular case was to use a proxy service. I created a service in Google App Engine that parsed the XML and returned a digested version of it in comma separated values format, containing only the values Salesforce needed out of the report, in the correct sequence, so Salesforce didn’t have to process anything and could just save to the database. The free version of Google App Engine was more than enough to support the load.
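The core of such a proxy can be quite small. Here is a minimal sketch of the XML to CSV step, assuming a report made of repeated row elements with one child element per value. The function name `report_to_csv`, the `row_tag` parameter, and the report layout are assumptions for illustration; the original service and report format are not shown here.

```python
import csv
import io
import xml.etree.ElementTree as ET


def report_to_csv(xml_text, row_tag, fields):
    """Reduce an XML report to CSV with only the requested fields, in order.

    row_tag is the (assumed) element name that wraps one record, and fields
    lists the child element names to keep, in the column order the consumer
    expects. Missing values become empty strings.
    """
    root = ET.fromstring(xml_text)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in root.iter(row_tag):
        writer.writerow([(row.findtext(name) or "").strip() for name in fields])
    return out.getvalue()
```

Because the proxy drops everything Salesforce does not need and fixes the column order, the Apex side only has to split lines and save records, which keeps heap usage low.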

Compression
Salesforce supports compressed content in HTTP calls, but only when the compression is at the transport level. If the body of the response is itself a binary compressed file, there is no way to read it in Salesforce.

To address this issue I also used Google App Engine. I created another proxy service that unzipped the body of the request and returned it as plain text to Salesforce.
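A sketch of what the decompression step of such a proxy might look like follows. The compression format is not stated above, so this assumes the body is either gzip data or a zip archive with a single text member, and falls back to treating it as plain text. `decompress_body` is a hypothetical helper name.

```python
import gzip
import io
import zipfile


def decompress_body(body, encoding="utf-8"):
    """Return the text content of a possibly compressed HTTP body (bytes)."""
    if body[:2] == b"\x1f\x8b":  # gzip magic number
        return gzip.decompress(body).decode(encoding)
    if zipfile.is_zipfile(io.BytesIO(body)):
        with zipfile.ZipFile(io.BytesIO(body)) as archive:
            # Assumption: the archive holds at least one member and the
            # first one is the file we want.
            first_member = archive.namelist()[0]
            return archive.read(first_member).decode(encoding)
    return body.decode(encoding)  # not compressed, pass through as text
```

The proxy would wrap this in a small HTTP handler and return the result as plain text.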

10 call outs per execution
This is probably the hardest one to work around. Within a single Salesforce execution you may only make 10 external calls, such as HTTP calls or web service calls.

This means that you have to make every call out count. It is imperative to have services that deal in lists of objects rather than one object at a time. You also have to balance the amount of memory available against the size of the results you can process at a time.
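The arithmetic behind that balance can be sketched as follows. This is a language-neutral illustration in Python, not Apex. The limit of 10 call outs comes from the text above, while `max_batch_size` is a hypothetical number you would tune against the heap limit, and `plan_batches` is an illustrative name.

```python
def plan_batches(records, max_callouts=10, max_batch_size=200):
    """Split records into list-based call outs that fit in one execution.

    Returns (batches, leftover): at most max_callouts batches of at most
    max_batch_size records each, plus the records that must wait for the
    next execution.
    """
    capacity = max_callouts * max_batch_size
    now, leftover = records[:capacity], records[capacity:]
    batches = [now[i:i + max_batch_size] for i in range(0, len(now), max_batch_size)]
    return batches, leftover
```

With these example numbers a single execution can move at most 2,000 records; anything beyond that has to be picked up by a later run.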

When designing your API, also keep in mind that each call must return in less than 120 seconds. That is not too bad, but it does make things trickier when you are trying to do as much as you can with each call out.

The tip here is to break the process up: build processes that can pick up where they previously left off, and have lengthy processes execute more often while handling less data in each run.

You can start each run from a trigger, from a scheduled job, or from outside of Salesforce by exposing the process as a web service.

No call outs during an active transaction
This is another tricky one. If you want to make a call out to an HTTP service or web service, there must be no uncommitted database changes pending in the current transaction. The usual workaround is to keep a list of all of the objects you have to update and only save/create/delete after all the call outs needed by your service have been done.

This also means that it is not possible to perform a call out directly inside a trigger, because triggers run in the context of a transaction. There is the concept of future calls, which are basically asynchronous invocations of Apex code. If you make a future call in a trigger, the future call will run after the transaction is complete, and therefore it can make call outs. Keep in mind you only get 200 future calls per user license per day. That may sound like a lot, but if you use them to trigger your integration after the update of each record, and people update more than 200 records a day, or if you have multiple future calls per trigger, you can see how easy it is to hit that limit.

Future calls are valuable resources, and there is also a chance that you will be competing for the daily allowance with existing customizations in the instance.

It is usually a good idea to complement the future call pattern with a scheduled job that sweeps for records that were not processed by the future calls.

Scheduler limitations
In Salesforce you can have scheduled jobs, but they are limited. The standard scheduler can only schedule hourly jobs. Programmatically it is possible to schedule them to run more often, down to every minute. However, in that case you have to schedule one job per minute of the hour, for a total of 60 jobs, and each scheduled job counts against a limit of 100 scheduled jobs.
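To make the arithmetic concrete, here is a small sketch that generates the schedule expressions such a setup would need. It assumes the cron-like layout accepted by Apex’s System.schedule (seconds, minutes, hours, day of month, month, day of week), where the minutes field holds a single value; `hourly_cron_expressions` is a hypothetical helper, written in Python purely for illustration.

```python
def hourly_cron_expressions(every_n_minutes=1):
    """Build one hourly schedule expression per firing minute.

    Each expression fires once an hour at a fixed minute, so running
    every N minutes takes 60 / N separate scheduled jobs.
    """
    return [f"0 {minute} * * * ?" for minute in range(0, 60, every_n_minutes)]


def remaining_job_slots(expressions, job_limit=100):
    """Scheduled job slots left for everything else in the org."""
    return job_limit - len(expressions)
```

Running every minute leaves only 40 of the 100 slots for all other scheduled jobs, while running every five minutes takes 12 slots and leaves 88.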

Also, scheduled jobs are not guaranteed to run at the exact time they are scheduled. Rather, the Salesforce documentation says the job will be executed when resources become available.

A heartbeat pattern is a good way to work around this. The idea is that you expose a web service from Salesforce that triggers the integration, and then you have something outside of Salesforce call it: a cron job running on an EC2 server, a very simple app in Heroku or Google App Engine, or any other platform that has better scheduling features. The external caller simply invokes your service, making the integration run. The complex logic still resides in Salesforce, but the scheduling happens outside.
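The external side of the heartbeat can be a few lines run from cron. The sketch below is an assumption-heavy illustration: the endpoint URL, the `job` payload field, and the function names are hypothetical, and obtaining the access token is out of scope. It only shows the shape of the call.

```python
import json
import urllib.request


def build_heartbeat_request(endpoint_url, access_token, job_name):
    """Build the POST that asks Salesforce to run one integration cycle.

    endpoint_url is whatever custom web service you exposed (hypothetical
    here), and the JSON payload shape is an assumption for illustration.
    """
    payload = json.dumps({"job": job_name}).encode("utf-8")
    return urllib.request.Request(
        endpoint_url,
        data=payload,
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )


def send_heartbeat(request, timeout_seconds=120):
    """Send the heartbeat and return the HTTP status code."""
    with urllib.request.urlopen(request, timeout=timeout_seconds) as response:
        return response.status
```

A crontab entry that runs a script calling `send_heartbeat` every minute gives you the per-minute cadence without consuming any scheduled job slots in Salesforce.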

General tips
Try to use Salesforce functionality as much as possible instead of trying to code everything. For instance, if you want to send an e-mail notification, instead of coding the e-mail you can create a workflow rule on field update to send the notification for you. We did this with errors. We have an error log object where we save information about issues, and if the error type is Critical, a workflow rule sends an e-mail.

Create processes that can save state. In other words, make processes that are parameterized, so that their parameters can be saved to a persistent store, such as a custom object. This way, if a process fails or can’t finish all the work, it can be invoked again later with the saved parameters to retry or finish the work. This is at the core of error handling and recovery.
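Here is a minimal sketch of that idea, written in Python as a language-neutral illustration. The `store` dictionary stands in for the persistent store (a custom object in Salesforce), and `run_resumable`, `next_index`, and `status` are hypothetical names.

```python
def run_resumable(job_id, items, store, process, max_items_per_run):
    """Process a slice of items, persisting where to resume from.

    store is any dict-like persistent store keyed by job id. On failure
    the saved position points at the failing item, so the next invocation
    retries it instead of skipping it.
    """
    state = store.get(job_id, {"next_index": 0, "status": "pending"})
    start = state["next_index"]
    end = min(start + max_items_per_run, len(items))
    for index in range(start, end):
        try:
            process(items[index])
        except Exception as error:
            store[job_id] = {"next_index": index, "status": "failed", "error": str(error)}
            return store[job_id]
    status = "done" if end == len(items) else "pending"
    store[job_id] = {"next_index": end, "status": status}
    return store[job_id]
```

Each invocation does a bounded amount of work and records its position, so a scheduled sweep or an external heartbeat can keep calling it until the status is done.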
