Wednesday, February 23, 2011

Undeploying BPEL process

Nothing cutting edge here, but this caused us a lot of problems during the 10.1.3.5 upgrade. It seems that 10.1.3.5, and especially 10.1.3.5.2, performs a lot of XSD validation, and a BPEL process can fail to load if any of those validations fail.

When that happens, BPELConsole doesn't let you click on the process. The appropriate action would be to undeploy or redeploy such processes, but you cannot undeploy a process without clicking on it.

1) One way to undeploy such a process is to clean up the database. We tried the following:
a) clean up the database tables
PROCESS
PROCESS_DEFAULT
PROCESS_DESCRIPTOR
PROCESS_LOG
SUITCASE_BIN
b) clean up the BPEL process under the tmp directory (bpel/domains/domain-name/tmp)
c) clean up ESB in BPELSystem

This was quite a cumbersome process.

2) The other way we found was to use the following URL:
http://host:port/BPELConsole/<<domain>>/undeployProcess.jsp?processId=<<process-id>>&revisionTag=<<revision>>


It was much cleaner to undeploy this way.
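
If you have many broken processes, it may help to generate these URLs instead of typing them by hand. Below is a minimal sketch that only builds the URL string following the pattern above; it does not call the console. The class name, the host "soahost", the port 7777 and the process names are made up for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper: builds the undeployProcess.jsp URL described above.
final class UndeployUrlBuilder {
    private UndeployUrlBuilder() {}

    static String build(String host, int port, String domain,
                        String processId, String revision) {
        // The domain is a path segment and is used verbatim;
        // only the two query parameters are URL-encoded.
        return "http://" + host + ":" + port
                + "/BPELConsole/" + domain
                + "/undeployProcess.jsp?processId=" + encode(processId)
                + "&revisionTag=" + encode(revision);
    }

    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is a mandatory charset on every JVM.
            throw new IllegalStateException(e);
        }
    }
}
```

For example, UndeployUrlBuilder.build("soahost", 7777, "default", "OrderProcess", "1.0") produces a URL you can paste into the browser where you are logged in to BPELConsole.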


Thursday, February 10, 2011

handleTopLevelFault - SOA 10.1.3.5

I think all of us know that transaction=participate in 10.1.3.3 was replaced by the handleTopLevelFault property by 10.1.3.5. I never realized the impact it had on our processes, or how Oracle keeps changing its mind about what this value is and how the server behaves.

First of all, in 10.1.3.3, BPEL processes DON'T participate in a transaction by default. That means if you have Process A -> Process B -> Process C, and

[C] is disabled

try {
    insert table 1
    invoke C
} catch {
    insert table 2
    throw
}


In this case, when invoke C fails, the entries in table 1 and table 2 are never rolled back, even if the data sources are configured for XA transactions. If you do put transaction=participate in the process configuration, then the process and its operations run in the global transaction, and the throw activity rolls back the inserts into tables 1 and 2.

[Note: if you don't have the throw, the process acts as if there were no error and nothing is rolled back in any scenario, which is expected.]

Then, in 10.1.3.4, they replaced transaction=participate with handleTopLevelFault=false. The default value in the code [com.collaxa.cube.engine.core.map.BPELProcessBlock] was true, which is why the behavior was exactly the same as in 10.1.3.3: nothing participates in a transaction unless you explicitly say so. You cannot define this property at the domain level; it is only supported at the process level.

In 10.1.3.5, the default value in [com.collaxa.cube.engine.core.map.BPELProcessBlock] was changed to "false". That caused us numerous problems. Following AIA best practice, we always have our own try/catch block: we do some processing in the catch block (e.g. update the db, a queue, etc.) and then rethrow the error so the caller gets the details and the instance shows as faulted. We had the pattern shown earlier all over the place. In 10.1.3.3 we could see the data inserted from the catch block, but in 10.1.3.5 everything was rolled back because the process started participating in the transaction by default.

I agree it is not a very good coding practice, but it worked in 10.1.3.3 and nobody realized it could cause transaction issues during the upgrade. The ideal solution is to run the catch block's work in a local transaction, so that a rollback of the global transaction doesn't undo what you did in the catch activity.

[C] is disabled
try {
    insert table A
    invoke C
} catch {
    insert table B [make sure DB adapter is using tx-level='local' in data source]
    throw
}
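
To make the three behaviors easier to compare, here is a toy model in plain Java. It is only a sketch of the semantics described above, not Oracle code: there is no real database or BPEL engine, the class and parameter names are made up, and "rollback" simply means discarding the rows that were enlisted in the simulated global transaction.

```java
import java.util.ArrayList;
import java.util.List;

// Toy simulation (hypothetical names): which inserts survive when invoke C fails?
final class TxSimulation {
    private TxSimulation() {}

    /**
     * @param participatesInGlobalTx true models transaction=participate
     *                               (handleTopLevelFault=false)
     * @param catchInsertIsLocal     true models a local-tx data source for table B
     * @return rows still present after the fault has been rethrown
     */
    static List<String> run(boolean participatesInGlobalTx, boolean catchInsertIsLocal) {
        List<String> committed = new ArrayList<>();     // work committed immediately
        List<String> pendingGlobal = new ArrayList<>(); // work enlisted in the global tx
        try {
            try {
                insert("table A", participatesInGlobalTx, pendingGlobal, committed);
                throw new IllegalStateException("invoke C failed: C is disabled");
            } catch (IllegalStateException fault) {
                boolean enlist = participatesInGlobalTx && !catchInsertIsLocal;
                insert("table B", enlist, pendingGlobal, committed);
                throw fault; // rethrow so the caller sees the fault
            }
        } catch (IllegalStateException topLevelFault) {
            // The fault reached the top level: the global transaction rolls back.
            pendingGlobal.clear();
        }
        return committed;
    }

    private static void insert(String row, boolean enlist,
                               List<String> pendingGlobal, List<String> committed) {
        if (enlist) {
            pendingGlobal.add(row);
        } else {
            committed.add(row);
        }
    }
}
```

In this model, run(false, false) keeps both rows (the 10.1.3.3 default), run(true, false) keeps nothing (the 10.1.3.5 default), and run(true, true) keeps only table B, which is the outcome the local-transaction catch block is after.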


Anyway, we had more than 200 processes and couldn't afford to change the code for each one of them, so I came up with a neat solution that worked without any issues. I used the BPELClient API to insert "handleTopLevelFault=true" into the majority of the BPEL processes, and that's it. No restart, no code change, and life was back to normal. The only flaw is that somebody could redeploy the code and overwrite the descriptor, but we can just run the util again in that case.

Here is what the code looked like:
ProcessDescriptor processDescriptor = bpelProcessHandle.getDescriptor();
ConfigurationsDescriptor configurationsDescriptor = processDescriptor.getConfigurations();
configurationsDescriptor.setPropertyValue("handleTopLevelFault",appropriateValue);  // set or add
bpelProcessHandle.updateDescriptor(processDescriptor);




Another side effect of handleTopLevelFault



If your process has handleTopLevelFault = true, the faulted instance shows up in both the Flow view and the Audit view in BPELConsole.

If your process has handleTopLevelFault = false (or doesn't have this property at all in 10.1.3.5), the instance does not show up in the Flow view and only shows in the Audit view. That was a new enhancement added in 10.1.3.5.

I fixed the JavaScript issue so that the faulted instance shows up in the Flow view even if you have handleTopLevelFault=false. Details can be found at http://chintanblog.blogspot.com/2011/01/bpelconsole-faulted-instance-in-10135.html

Thursday, February 3, 2011

BAM Filtering / Governance 11g

On one of our projects, we were trying to push a lot of data to BAM (some trace data along with the data required for reports). We needed a solid approach where we could control the amount of data going to BAM at run time, just like a logging level (info/debug). I agree that for infrastructure and instance tracing, CAMM, Amberpoint or Grid Control would be a better solution, but not all clients have the luxury of investing in those tools. Sometimes EM and BAM can help with basic troubleshooting.

The only out-of-the-box approach I found was to use EM to disable monitoring at the composite level. There is pretty much nothing if you are using the BAM adapter (e.g. in Mediator). We can also use the following at the composite level, but it requires redeployment:

   <property name="disableProcessSensors" many="false">true</property>

The following was a pretty intuitive approach we found to control the data going to BAM, and it lets us stop/start/control BAM publishing at runtime. In the BAM Sensor (Sensor action) there is a feature called Filter. From there we can call any XSL expression, so we wrote a custom XSL function which reads some configuration from the JVM. We control whether the BAM sensor publishes an event by returning true/false from this function.
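
As a rough idea of what the Java side of such a function can look like, here is a minimal sketch. It is not our actual function: the class name, the system property name "bam.publish.level" and the OFF/INFO/DEBUG levels are assumptions for illustration, and registering the method as a custom function in SOA Suite is not shown.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical filter: decides whether a sensor tagged with a level may publish.
final class BamPublishFilter {
    // Assumed JVM system property; it could be changed at runtime by an admin utility.
    static final String LEVEL_PROPERTY = "bam.publish.level";

    // Ordered from least to most verbose; the index doubles as the rank.
    private static final List<String> LEVELS = Arrays.asList("OFF", "INFO", "DEBUG");

    private BamPublishFilter() {}

    /** Returns true if a sensor tagged with sensorLevel should publish to BAM. */
    static boolean shouldPublish(String sensorLevel) {
        // Assumption: when the property is not set, INFO-level data is published.
        int configured = rank(System.getProperty(LEVEL_PROPERTY, "INFO"));
        int requested = rank(sensorLevel);
        return requested > 0 && requested <= configured;
    }

    private static int rank(String level) {
        if (level == null) {
            return 0;
        }
        int index = LEVELS.indexOf(level.trim().toUpperCase(Locale.ROOT));
        return index < 0 ? 0 : index; // unknown values are treated as OFF
    }
}
```

With the property set to INFO, a sensor tagged "DEBUG" is filtered out while one tagged "INFO" still publishes; setting the property to OFF stops all publishing without a redeployment.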


We did a similar thing in Mediator, which was quite simple because it also provides a filter expression.


This way we had complete, centralized control over BAM publishing.