Chintan Shah's Blog

Wednesday, February 23, 2011

Undeploying BPEL process

Nothing cutting edge here, but this caused us a lot of problems during the 10.1.3.5 upgrade. It seems that 10.1.3.5, and especially 10.1.3.5.2, performs a lot of XSD validations, and a BPEL process can fail to load if any of those validations fail.

BPELConsole doesn't let you click on such a process. The appropriate action would be to undeploy or redeploy the process, but without clicking on it you cannot undeploy it.

1) One way to undeploy such a process is to clean up the database. We tried the following:
a) clean up the database tables
PROCESS
PROCESS_DEFAULT
PROCESS_DESCRIPTOR
PROCESS_LOG
SUITCASE_BIN
b) clean up the BPEL process under tmp directory (bpel/domains/domain-name/tmp)
c) clean up ESB in BPELSystem

This was quite a cumbersome process.

2) The other way we found was to use the following URL:

http://host:port/BPELConsole/<<domain>>/undeployProcess.jsp?processId=<<process-id>>&revisionTag=<<revision>>

It was much cleaner to undeploy this way.
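
If many broken processes need to be undeployed, the URL can be assembled programmatically. The following is a minimal sketch and not part of the original procedure: the class name UndeployUrlBuilder and the sample host, port, domain, process ID and revision are made up for illustration. It only builds the string; it does not send the request.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper: builds the undeployProcess.jsp URL shown above.
final class UndeployUrlBuilder {

    // Assembles the URL; processId and revision are encoded as query parameters.
    static String build(String host, int port, String domain, String processId, String revision) {
        return "http://" + host + ":" + port + "/BPELConsole/" + domain
                + "/undeployProcess.jsp?processId=" + encode(processId)
                + "&revisionTag=" + encode(revision);
    }

    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a standard JVM.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Sample values only; replace with your own host, domain and process.
        System.out.println(build("localhost", 7777, "default", "OrderProcess", "1.0"));
    }
}
```

The domain is inserted as-is, on the assumption that it is a simple name such as default.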

Thursday, February 10, 2011

handleTopLevelFault - SOA 10.1.3.5

I think all of us know that transaction=participate in 10.1.3.3 was replaced by the handleTopLevelFault property by 10.1.3.5. I never realized the impact this had on our processes, or how Oracle keeps changing its mind about the default value and how the server behaves.

First of all, in 10.1.3.3, BPEL processes DON'T participate in the transaction by default. This means that if you have Process A -> Process B -> Process C, and

[C] is disabled

try {
    insert table 1
    invoke C
} catch {
    insert table 2
    throw
}

In this case, when invoke C fails, the inserts into table 1 and table 2 are never rolled back, even if the data sources are configured for XA transactions. If you put transaction=participate in the process configuration, the process and its operations run in the global transaction, and the throw activity rolls back the inserts into tables 1 and 2.

[Note: if you don't have the throw, the process behaves as if there was no error and nothing is rolled back in any scenario, which is to be expected.]

Now, in 10.1.3.4, they replaced transaction=participate with handleTopLevelFault=false. The default value in the code [com.collaxa.cube.engine.core.map.BPELProcessBlock] was true, so the behavior was exactly the same as in 10.1.3.3: nothing participates in the transaction unless you explicitly specify it. You cannot define this property at the domain level; it is only supported at the process level.

In 10.1.3.5, the default value in [com.collaxa.cube.engine.core.map.BPELProcessBlock] was changed to "false". This caused numerous problems for us. Based on AIA best practice, we always have our own try/catch block; in the catch block we do some processing (e.g. update a database or a queue) and then rethrow the error so that the caller gets the details of the error and the instance shows as faulted. We had the pattern shown earlier all over the place. In 10.1.3.3 we could see the data inserted from the catch block, but in 10.1.3.5 everything was rolled back because the processes started participating in the transaction by default.

I agree that this is not a very good coding practice, but it worked in 10.1.3.3 and nobody realized it could cause transaction issues during the upgrade. The ideal solution is to run the work in the catch block in a local transaction, so that the rollback of the global transaction doesn't undo it.

[C] is disabled
try {
    insert table A
    invoke C
} catch {
    insert table B [make sure DB adapter is using tx-level='local' in data source]
    throw
}

Anyway, we had more than 200 processes and couldn't afford to change the code for each of them, so I came up with a neat solution that worked without any issues. I wrote a utility using the BPELClient API to insert "handleTopLevelFault=true" into the majority of the BPEL processes, and that's it. No restart, no code change, and life was back to normal. The only flaw is that somebody may redeploy the code and overwrite the descriptor, but in that case we can just run the utility again.

Here is what the code looked like:

ProcessDescriptor processDescriptor = bpelProcessHandle.getDescriptor();
ConfigurationsDescriptor configurationsDescriptor = processDescriptor.getConfigurations();
configurationsDescriptor.setPropertyValue("handleTopLevelFault",appropriateValue);  // set or add
bpelProcessHandle.updateDescriptor(processDescriptor);
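
The version differences described in this post can be summarized in a few lines of plain Java. This is only a sketch of the behavior as described above, not Oracle code: the class and method names are hypothetical, and the 10.1.3.3 transaction=participate property is left out for brevity.

```java
// Hypothetical model of the handleTopLevelFault behavior described in this post.
final class TopLevelFaultRules {

    // Default used when the process descriptor does not set handleTopLevelFault.
    static boolean defaultHandleTopLevelFault(String version) {
        switch (version) {
            case "10.1.3.4":
                return true;   // behaves like 10.1.3.3: no participation by default
            case "10.1.3.5":
                return false;  // participates in the caller's transaction by default
            default:
                throw new IllegalArgumentException("Version not covered by this sketch: " + version);
        }
    }

    // configured is the value from the process descriptor, or null if the property is absent.
    static boolean participatesInGlobalTransaction(String version, Boolean configured) {
        boolean effective = (configured != null) ? configured : defaultHandleTopLevelFault(version);
        // handleTopLevelFault=false is the replacement for transaction=participate.
        return !effective;
    }

    public static void main(String[] args) {
        System.out.println(participatesInGlobalTransaction("10.1.3.4", null));
        System.out.println(participatesInGlobalTransaction("10.1.3.5", null));
        System.out.println(participatesInGlobalTransaction("10.1.3.5", Boolean.TRUE));
    }
}
```

The last call corresponds to what the utility above does: setting handleTopLevelFault=true explicitly restores the old behavior on 10.1.3.5.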

Another side effect of handleTopLevelFault

If your process has handleTopLevelFault = true, the faulted instance shows in both the Flow view and the Audit view in BPELConsole.

If your process has handleTopLevelFault = false (or doesn't have this property, in 10.1.3.5), the instance does not show in the Flow view and only shows in the Audit view. This was a new enhancement added in 10.1.3.5.

I fixed the JavaScript issue so that the faulted instance shows up in the Flow view even if you have handleTopLevelFault=false. Details can be found at http://chintanblog.blogspot.com/2011/01/bpelconsole-faulted-instance-in-10135.html

Thursday, February 3, 2011

BAM Filtering / Governance 11g

In one of our projects, we were trying to push a lot of data to BAM (some trace data along with the data required for reports). We needed a solid approach where we could control the amount of data going to BAM at run time, just like a logging level (info/debug). I agree that for infrastructure and instance tracing, CAMM, Amberpoint or Grid Control would be a better solution, but not all clients have the luxury of investing in those tools. Sometimes EM and BAM can help with basic troubleshooting.

The only out-of-the-box approach I found was to use EM to disable monitoring at the composite level. There is pretty much nothing if you are using the BAM adapter (e.g. in Mediator). We can also use the following property at the composite level, but it requires redeployment:

   <property name="disableProcessSensors" many="false">true</property>

The following fairly intuitive approach let us control the data going to BAM and stop, start or throttle BAM publishing at run time. A BAM sensor (sensor action) has a feature called Filter, from which we can call any XSL expression. So we wrote a custom XSL function that reads some configuration from the JVM, and we control whether the BAM sensor publishes an event by returning true or false from this function.

We used a similar approach in Mediator, which was simpler because Mediator provides a filter expression as well.

This way we had complete, centralized control over BAM publishing.
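
The Java side of such a filter function could look roughly like the sketch below. The post does not show the actual function, so everything here is an assumption: the class name, the system property name bam.publish.level and the OFF/INFO/DEBUG levels are made up, and registering the method as a custom function with the SOA server is not shown.

```java
import java.util.Locale;

// Hypothetical publish gate: decides from a JVM system property whether an event goes to BAM.
final class BamPublishFilter {

    // Assumed property name; not an Oracle-defined property.
    static final String LEVEL_PROPERTY = "bam.publish.level";

    // Maps a level name to a rank; unknown or missing names get the fallback rank.
    private static int rank(String level, int fallback) {
        if (level == null) {
            return fallback;
        }
        switch (level.trim().toUpperCase(Locale.ROOT)) {
            case "OFF":
                return 0;
            case "INFO":
                return 1;
            case "DEBUG":
                return 2;
            default:
                return fallback;
        }
    }

    // Returns true if an event of the given level should be published right now.
    static boolean shouldPublish(String eventLevel) {
        // Read on every call, so a changed property takes effect without redeployment.
        int configured = rank(System.getProperty(LEVEL_PROPERTY), 1); // default: INFO
        int event = rank(eventLevel, 2);                              // unknown events count as DEBUG
        return event > 0 && event <= configured;
    }

    public static void main(String[] args) {
        System.setProperty(LEVEL_PROPERTY, "INFO");
        System.out.println(shouldPublish("INFO"));
        System.out.println(shouldPublish("DEBUG"));
    }
}
```

With the property set to INFO, report data tagged INFO is published while trace data tagged DEBUG is dropped; setting it to OFF stops publishing entirely.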

Tuesday, January 18, 2011

BPELConsole Faulted Instance in 10.1.3.5

Recently we upgraded to 10.1.3.5 and found that a faulted instance doesn't show up in the flow view in BPELConsole. You can still see the details in the Audit tab, but the Flow tab shows the following JavaScript error:

Line: 57
Char: 48
Error: 'this.wi.state' is null or not an object
Code: 0
URI: http://localhost:7777/BPELConsole/default/lib/flowviewer.js?gvpn0m

- First of all, faulted instances are the ones you want to look at first in BPELConsole
- The Audit tab is very busy, whereas the flow view is graphical and very compact

Upon following up with Oracle, we found that this is expected behavior! (http://download.oracle.com/docs/cd/E14101_01/doc.1013/e15342/bpelrn.htm#BABCGIBB). Since when did a JavaScript error on a web page become expected behavior?

Anyway, it was time to get back into the dirty details of the JSPs and JavaScript, and I was finally able to fix the issue by changing only the flowviewer.js file. Here are the steps, which should be applicable to any version:

1) Open $ORACLE_HOME/j2ee/oc4j_soa/applications/orabpel/console/lib/flowviewer.js
2) This JS is very compact, so you can use an online beautifier. I used:
http://jsbeautifier.org/
3) Search for open.faulted; you should see something like:

                } else {
                    if (this.wi.state == "4" || this.wi.state == "open.faulted") {
                        this.checkforESB(e);
                        this.templateId = "activity.faulted";
                        return
                    } else {
                        if (this.wi.state == "5" || this.wi.state == "6" || this.wi.state == "7" || this.wi.state == "8" || this.wi.state == "11" || this.wi.state == "12" || this.wi.state.substring(0, 6) == "closed") {
                            this.checkforESB(e);
                            this.templateId = "activity.closed";
                            return
                        } else {
                            this.checkforESB(e);
                            this.templateId = "activity.pending";
                            return
                        }
                    }
                }

4) Make the changes as below (the added checks handle a null this.wi.state):


                } else {
                    if (    (  this.wi.state == null &&   this.type == "4") ||    this.wi.state == "4" || this.wi.state == "open.faulted") {
                        this.checkforESB(e);
                        this.templateId = "activity.faulted";
                        return
                    } else {
                        if (   this.wi.state == null ||   this.wi.state == "5" || this.wi.state == "6" || this.wi.state == "7" || this.wi.state == "8" || this.wi.state == "11" || this.wi.state == "12" || this.wi.state.substring(0, 6) == "closed") {
                            this.checkforESB(e);
                            this.templateId = "activity.closed";
                            return
                        } else {
                            this.checkforESB(e);
                            this.templateId = "activity.pending";
                            return
                        }
                    }
                }

5) In IE or Firefox, press Ctrl+F5 to refresh your local copy of the JS, and you should be able to see faulted instances just like before.

If you are using 10.1.3.5 or 10.1.3.5.2, you can just download flowviewer.js from here.
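
To make the effect of the patched condition in step 4 easier to check, here is the same decision logic ported to a small Java method. This is only an illustrative model, not the console code: the class name is hypothetical, and state and type are modelled as strings, whereas the original script relies on JavaScript's loose == comparison.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical Java port of the patched template selection in flowviewer.js.
final class FlowViewerStateModel {

    // Numeric state codes that the script treats as closed.
    private static final List<String> CLOSED_CODES = Arrays.asList("5", "6", "7", "8", "11", "12");

    // Returns the template ID the patched script would pick for a work item.
    static String templateFor(String state, String type) {
        if ((state == null && "4".equals(type))
                || "4".equals(state)
                || "open.faulted".equals(state)) {
            return "activity.faulted";
        }
        // The null check comes first, so the prefix test is never run on a null state.
        if (state == null || CLOSED_CODES.contains(state) || state.startsWith("closed")) {
            return "activity.closed";
        }
        return "activity.pending";
    }

    public static void main(String[] args) {
        System.out.println(templateFor(null, "4"));
        System.out.println(templateFor(null, "1"));
        System.out.println(templateFor("closed.completed", "1"));
    }
}
```

The first two calls cover the case that used to raise the JavaScript error: a missing state is now mapped to faulted when the type is "4" and to closed otherwise.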

OWSM gateway clean up

In a fresh OWSM install, you usually get the following error message in the gateway.log file:

2011-01-15 18:39:37,194 FINEST [Thread-77] configuration.PolicySetWatchdog - Failed to retrieve policy set from policy manager with url http://host:7777/policymanager/services/RegistrationService
com.cfluent.policymanager.sdk.base.exception.ServerException: java.lang.Exception: Invalid component ID - C0003001
        at com.cfluent.policymanager.sdk.client.soap.SoapComponentConfigurator.getPolicies(SoapComponentConfigurator.java:185)
        at com.cfluent.agent.configuration.PolicySetWatchdog.getPolicySetFromPolicyManager(PolicySetWatchdog.java:168)
        at com.cfluent.agent.configuration.PolicySetWatchdog.pollFromPolicyManager(PolicySetWatchdog.java:207)
        at com.cfluent.agent.configuration.PolicySetWatchdog.run(PolicySetWatchdog.java:91)
2011-01-15 18:39:49,484 FINEST [Thread-77] configuration.PolicySetWatchdog - Checking Policy Manager
2011-01-15 18:39:49,530 WARNING [Thread-77] configuration.PolicySetWatchdog - Failed to retrieve policy set from policy manager with url http://host:7777/policymanager/services/RegistrationService: com.cfluent.policymanager.sdk.base.exception.ServerException: java.lang.Exception: Invalid component ID - C0003001
2011-01-15 18:39:49,530 FINEST [Thread-77] configuration.PolicySetWatchdog - Failed to retrieve policy set from policy manager with url http://host:7777/policymanager/services/RegistrationService

This is a known issue: after a fresh install there is no gateway, and therefore no component called C0003001. If you create a gateway component, it gets the ID C0003001 and this error goes away.

In our environment, we had the C0003001 component as the gateway, but we decided to deactivate it and create a new one, and this error showed up again. At that point, deleting components or recreating the gateway doesn't guarantee the C0003001 ID. On the gateway, we started getting the error: Gateway is not ready to process requests.

To resolve the issue, we cleaned up the entire OWSM repository and reset the sequence as below:

DELETE FROM SERVICES;
DELETE FROM COMPONENT_POLICY_MAPPINGS;
DELETE FROM COMPONENTS;
DROP SEQUENCE COMPONENT_ID_SEQ;
CREATE SEQUENCE "ORAWSM"."COMPONENT_ID_SEQ" MINVALUE 1 MAXVALUE 999999999999 INCREMENT BY 1 START WITH 3001 NOCACHE NOORDER CYCLE;

After that, if we recreate the gateway as the first component, it gets the ID C0003001 and the issue is resolved right away.
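
The post does not say how OWSM derives a component ID from the sequence. The reset to START WITH 3001 and the resulting ID C0003001 are consistent with a "C" prefix followed by the sequence value zero-padded to seven digits. The sketch below only illustrates that guess; it is not documented OWSM behavior, and the class name is hypothetical.

```java
import java.util.Locale;

// Hypothetical illustration of how a sequence value might map to a component ID.
final class ComponentIdSketch {

    // Assumption: ID = "C" + sequence value padded with zeros to 7 digits.
    static String componentId(long sequenceValue) {
        return String.format(Locale.ROOT, "C%07d", sequenceValue);
    }

    public static void main(String[] args) {
        // With the sequence reset to START WITH 3001, the first component would get this ID.
        System.out.println(componentId(3001));
    }
}
```

Under this assumption, every component created after the first one moves further away from C0003001, which would explain why recreating the gateway without resetting the sequence did not help.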

Wednesday, December 22, 2010

XSL Refresh 10g

It is always good practice to keep all XSDs outside the BPEL and ESB projects and refer to them via an http URL (or oramds in 11g). The XSLs, however, remain in the BPEL or ESB project, and changes to an XSL require redeployment.

We have a lot of logic in XSL, especially in AIA, and many XSLs can be reused. We explored the options for changing XSLs in real time.

Option 1: XSL embedded in BPEL process
If the XSL is embedded in the BPEL process, it is cached inside the transformation engine. Fortunately, the XSL function used by BPEL and ESB provides a way to initialize the cache: com.collaxa.cube.xml.xpath.functions.xml.GetElementFromXSLTFunction provides an initializeCache function. So if you deploy a Java class in the same JVM as the BPEL runtime, it can call this method and reinitialize the XSL cache. I did exactly that: I created a Java class that calls GetElementFromXSLTFunction initializeCache and exposed it as a web service.

Here are the detailed steps:

(one time) Deploy the XSL Refresh web service (download) to the same container as BPEL
Change the XSL under $ORACLE_HOME/bpel/domains/domain-name/tmp/.bpel.../.. as needed
Go to http://host:port/XSLTRefresh/RefreshXSLTPoolSoapHttpPort and hit invoke

This ensures that new XSL changes take effect immediately. This approach came in quite handy during the development phase, as I could avoid redeploying AIA processes for XSL changes and get a better turnaround.

Option 2: XSL on Apache Server
I think a cleaner way would be to put the XSLs outside the BPEL project and deploy them separately from the BPEL and ESB deployments. E.g.

<from expression="ora:processXSLT('http://host:7777/xsl/Transform_IP_OP_Sample.xsl',bpws:getVariableData('inputVariable','payload'))"/>

The BPEL XSL engine doesn't cache the XSL handle if it is http based, but you can use the XSL refresh web service if you see any issues, depending on the BPEL version you are using.

Tuesday, December 21, 2010

IE8 with Oracle ESB 10g

We were having issues with the ESB console when we upgraded from IE6 to IE8. BPELConsole, AIA and the OWSM console worked fine with IE8. If you use 10.1.3.5+, I think this issue is fixed, but for lower versions of Oracle ESB you can use the following to make the console compatible with IE8.

For ESB Console:

- Modify the following files:

esb-dt-container\applications\esb-dt\esb_console\esb\widgets\InstancesView.js and
esb-dt-container\applications\esb-dt\esb_console\picasso\widgets\widgets.Perspective.js

- In these files, look for the line this.myEvents = [""] and comment it out with "//"
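
If the change has to be applied on several servers, it can be scripted. The following is a minimal sketch under a few assumptions: the class name is hypothetical, the statement is assumed to sit on a line of its own (commenting out the line would otherwise disable neighbouring code as well), and the patched text is printed rather than written back to the file.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper: comments out the this.myEvents = [""] line in a JS source.
final class MyEventsPatcher {

    static final String TARGET = "this.myEvents = [\"\"]";

    // Returns the source with every matching, not yet commented line prefixed by "//".
    static String commentOut(String source) {
        String[] lines = source.split("\n", -1); // -1 keeps trailing empty lines
        for (int i = 0; i < lines.length; i++) {
            String line = lines[i];
            if (line.contains(TARGET) && !line.trim().startsWith("//")) {
                // Keep the original indentation and insert the comment marker after it.
                int indent = line.length() - line.replaceAll("^\\s+", "").length();
                lines[i] = line.substring(0, indent) + "//" + line.substring(indent);
            }
        }
        return String.join("\n", lines);
    }

    public static void main(String[] args) throws Exception {
        // Usage: java MyEventsPatcher <path-to-js-file>; prints the patched content.
        String source = new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8);
        System.out.print(commentOut(source));
    }
}
```

Running it a second time on already patched content changes nothing, because commented lines are skipped. Back up both files before replacing them.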