Everyone can feel their own pulse, but it takes a cardiologist to interpret all the data the beats contain, diagnose whatever physiological conditions are present, and prescribe what to do about them. Likewise, many process industry players are calling in data scientists to find deeper insights in their process signals, and adopting analytics software to streamline the extraction of useful intelligence that can improve decisions and add value.
For instance, Parkland Corp.’s Burnaby refinery near Vancouver, B.C., Canada, recently launched a self-service analytics program using Seeq’s data analytics software. Its engineers have been using the software for about a year and a half for daily monitoring and incident investigation of its main air blower (MAB), boilers, wet gas compressor and flare stack, as well as sitewide inferential modeling. The software helps them and their colleagues save time, offloads low-value data-cleaning tasks, and reduces the organizational friction that analytics efforts can cause.
The 88-year-old refinery produces about 55,000 barrels per day, and meets about 25% of the province’s gasoline and diesel needs. It’s also working on expanding its range of bio-feedstocks, such as tallow and canola oil, to reduce its greenhouse gas (GHG) emissions and carbon footprint.
Filtering datasets
Siang Lim, data scientist at Parkland, reports that Burnaby’s first self-service analytics project was conditional filtering of large process datasets. Lim and Sarah Mahmoud, process engineer at Parkland, presented “Self-service analytics for processing hydrocarbons” at Seeq’s 2022 Conneqt event in Austin, Tex.
“We’re trying to find and count the blips in our flow measurements across multiple years. It sounds easy, but this is a surprisingly time-consuming and tedious process to do in Excel,” says Lim. “The solution was to use Seeq’s Value Search and Chain View functions to apply the right filters and hide irrelevant data to make it easier to gather the results we needed. By doing this in Seeq, we managed to complete the analysis in an hour, while previous attempts took more than 40 hours. The time saved also allowed us to do higher-value work on improving the quality of our investigation.”
Lim reports the refinery’s engineering team needed to know more about the blips and anomalies in its flow measurements because an earlier hazard and operability (HazOp) study found risks due to the MAB’s low-flow trip point being set too low. The engineers recommended a higher setpoint and tried to raise it. However, their operations colleagues pushed back because they’d already observed random dips in flow measurements, and were concerned that a higher setpoint could cause spurious trips in the unit or cause a process upset.
“Was that true? We didn’t know, so we had to do due diligence to find out how often these flow anomalies were happening before making further changes,” says Lim. “We wanted to find times where our feed rate was above a certain value, indicating the unit was running normally, and when airflow was below a certain value, indicating those flow anomalies.”
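The logic of that composite condition is simple to state, even though it’s painful to apply across years of high-resolution data in a spreadsheet. As a rough sketch of what the team was hunting for, the pandas snippet below applies the same two-part filter and groups consecutive hits into discrete events; the tag names and thresholds are hypothetical, and it stands in for, rather than reproduces, Seeq’s Value Search:

```python
import pandas as pd

# Hypothetical tag names and thresholds -- illustrative only,
# not Parkland's actual values
FEED_RATE_MIN = 500.0  # above this feed rate, the unit is running normally
AIRFLOW_MAX = 100.0    # an airflow reading below this counts as a dip

# Historian export with a timestamp index and one column per tag
df = pd.read_csv("historian_export.csv", index_col="timestamp", parse_dates=True)

# Composite condition: normal feed rate AND anomalously low airflow
anomalous = df[(df["feed_rate"] > FEED_RATE_MIN) & (df["airflow"] < AIRFLOW_MAX)]

# Group consecutive anomalous rows into discrete events, so a dip
# lasting several seconds is counted once rather than once per row
new_event = anomalous.index.to_series().diff() > pd.Timedelta(minutes=1)
events = anomalous.groupby(new_event.cumsum()).agg(
    start=("airflow", lambda s: s.index.min()),
    end=("airflow", lambda s: s.index.max()),
    min_airflow=("airflow", "min"),
)
print(f"Found {len(events)} low-flow events")
```

Even this compressed version hints at why Excel struggled: the event-grouping step alone has no clean spreadsheet equivalent over millions of rows.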
To find dips that might last only a few seconds and occur only every few weeks or months over five or 10 years, Lim states his team tried to use Excel, but it was too slow to apply its formulas and filters over millions of rows in such huge files. He adds the team sometimes found it difficult to build Aveva (formerly OSIsoft) PI System queries because they weren’t familiar with parts of the query language, or with SQL or Python scripting. Plus, the initial spreadsheet attempt took 40 hours due to historian setting adjustments, questions about the data and its resolution, and hitting Excel’s row limit, which required creating multiple files.
Fortunately, while conducting a trial with Seeq, Lim adds it was easy for the team to define composite conditions they needed, and adjust their investigation range to however many months or years they needed. This avoided the need to tweak historian retrieval settings, run if-else formulas in Excel, and filter millions of rows and multiple files—and enabled the conditional filtering attempt to succeed in just one hour.
“With Seeq, we rapidly and correctly identified all time periods with anomalies by using Chain View to hide the irrelevant datasets, and its analysis and results gave us the confidence to proceed with setpoint changes,” says Lim. “This is important because a spurious trip would have cost about $1 million per day in downtime, while not proceeding with changes would’ve resulted in unacceptable safety risks.”
Inferential performance assessment
The Burnaby refinery’s second self-service analytics project was assessing inferential models—also known as soft sensors—which are used for estimating process variables without online analyzers or measurements. However, model performance must be monitored to address equipment or process changes, even though it’s difficult to align timestamps for predictions compared to lab values.
In this case, Seeq’s Capsules function easily realigns timestamps for all samples and automatically calculates prediction errors. This avoids manually aligning sample timestamps, which is also very tedious in Excel. The team also used Seeq’s Asset Trees and Asset Scripts to scale its calculations to multiple inferential models.
Lim reports the simplest way to measure model performance is the residual method, which uses the absolute difference between predictions and lab results, known as the absolute error or residual.
“This sounds simple, but it’s hard because of the data cleaning part. It may take several hours to get lab results, and then what if you have to do it 10 or 100 times or for the whole year? This is where it becomes tedious,” says Lim. “Plus, how do you know the time shift? We have an indicator function in the DCS, so when we take a sample, the function goes from zero to one, and goes back to zero when we get the results. We convert that indicator function into Seeq’s Capsules, use its properties to score those values, and align predictions with lab values. We display raw data in three lanes, including lab samples, predictions from the model and indicator functions.” (Figure 1)
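The capsule trick Lim describes can be mimicked outside Seeq to see the idea behind it. In the hedged pandas sketch below (column names are hypothetical), each rising edge of the indicator marks when a sample is drawn, each falling edge marks when the lab result arrives, and the two are paired into one aligned table:

```python
import pandas as pd

# Hypothetical column names, for illustration only: "indicator" is the
# 0/1 DCS flag, "prediction" the inferential output, "lab" the lab signal
df = pd.read_csv("inferential_data.csv", index_col="timestamp", parse_dates=True)

# Capsule boundaries: indicator rising 0 -> 1 (sample drawn) and
# falling 1 -> 0 (lab result posted)
edges = df["indicator"].diff()
starts = df.index[edges == 1]
ends = df.index[edges == -1]

# Pair the prediction at the moment of sampling with the lab value
# available when the indicator drops; assumes the record begins and
# ends outside a sampling window
n = min(len(starts), len(ends))
aligned = pd.DataFrame({
    "sample_time": starts[:n],
    "prediction": df.loc[starts[:n], "prediction"].to_numpy(),
    "lab": df.loc[ends[:n], "lab"].to_numpy(),
})
```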
Once lab and prediction signals are converted to scalars bounded by the indicator function’s capsules, Lim reports that differences in predictions versus lab results are easy to calculate as a Seeq Formula. This aligns the predictions and lab results, and presents them on a cleaner, two-line “after-realigning” graph. This means the lab and prediction data is clean and ready for residual calculations, which Seeq displays as a simple, one-lane bar graph.
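From there, the residual itself is a one-line calculation, roughly what the team expresses as a Seeq Formula. Continuing the hypothetical `aligned` table from the sketch above, with an invented tolerance:

```python
# Absolute error (residual) between each prediction and its lab value
aligned["residual"] = (aligned["prediction"] - aligned["lab"]).abs()

# Flag samples exceeding a hypothetical acceptance threshold --
# a drifting inferential shows up as a run of flagged samples
TOLERANCE = 0.5
drifting = aligned[aligned["residual"] > TOLERANCE]
print(aligned["residual"].describe())
```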
“The results of doing inferential calculations in Seeq are that we avoided manually aligning timestamps in Excel by leveraging capsule properties,” adds Lim. “We also used these calculations as a template. This makes it easy to scale them to other inferentials by defining them as Asset Trees groups. This was a huge win for us to do these calculations in a more automated way.
“Simple tasks can be tedious without the right tools. Depending on the problem they’re trying to solve, users often don’t need or want machine learning or advanced algorithms. They just need the ability to quickly access, filter and clean their data. Seeq makes it easy to do this, which empowers our engineers to quickly get the results they need.”