Source data of the North Sea well inventory:
United Kingdom (UK) - Oil and Gas Authority (Dec. 2018) - https://data-ogauthority.opendata.arcgis.com/datasets/oga-wells-ed50
Contains information provided by the OGA.
Wells were extracted for the area covered by the PGS data set PGS Mega Survey Plus.
We measured the distance between each well and its closest bright spot with polarity reversal, both for the wells of the test group (n = 43) and for all wells within the seismic data set (n = 1,792; presented here). Furthermore, we calculated the mean RMS amplitude and the RMS amplitude standard deviation within a buffer radius of 300 m around the well paths, for all wells inside the seismic data set and for the visited wells. The radius of 300 m was chosen because it is the distance below which all visited wells of the test group showed gas release in the form of flares from the seafloor.
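The distance and buffer measurements described above could be sketched as follows. This is a minimal illustration, not the workflow used to build the data set: it assumes projected coordinates in metres, treats the well path as a list of vertices rather than continuous segments, and the function names (`nearest_distance`, `buffer_stats`) and all sample values are hypothetical.

```python
import math
import statistics

def nearest_distance(well_xy, bright_spots_xy):
    """Planar distance (m) from a well to its closest bright spot."""
    return min(math.dist(well_xy, spot) for spot in bright_spots_xy)

def buffer_stats(well_path_xy, samples, radius_m=300.0):
    """Mean and standard deviation of RMS amplitudes near a well path.

    samples is an iterable of (x, y, rms_amplitude). A sample counts as
    inside the buffer if it lies within radius_m of any path vertex
    (a simplification; a GIS buffer would use the full path geometry).
    Raises statistics.StatisticsError if no sample falls inside.
    """
    inside = [
        amp for x, y, amp in samples
        if any(math.dist((x, y), p) <= radius_m for p in well_path_xy)
    ]
    return statistics.mean(inside), statistics.pstdev(inside)

# Hypothetical inputs, for illustration only.
bright_spots = [(1000.0, 0.0), (0.0, 250.0)]
well_path = [(0.0, 0.0)]
rms_samples = [(100.0, 0.0, 2.0), (0.0, 200.0, 4.0), (500.0, 500.0, 10.0)]

distance_m = nearest_distance(well_path[0], bright_spots)
mean_rms, std_rms = buffer_stats(well_path, rms_samples)
```

With these made-up inputs the third sample lies outside the 300 m buffer, so only the first two amplitudes enter the statistics.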
We test whether the propensity of a well to leak can be identified using a logistic regression that includes regressors such as well activity data and/or derived parameters such as the mean RMS amplitude, the RMS amplitude standard deviation, the distance to the nearest bright spot with polarity reversal, and well age (spud date).
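As an illustration of the kind of model described above, the following sketch fits a logistic regression with a single regressor by Newton-Raphson iteration. It is not the fitting routine used for this data set; the function names and the ten distance/leakage pairs are hypothetical and chosen only so that the example runs.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(x, y, iterations=25):
    """Fit P(y = 1) = sigmoid(b0 + b1 * x) by Newton-Raphson.

    Assumes the two classes overlap in x; for perfectly separable
    data the maximum-likelihood estimate does not exist.
    """
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = sigmoid(b0 + b1 * xi)
            w = p * (1.0 - p)
            g0 += yi - p              # score for the intercept
            g1 += (yi - p) * xi       # score for the slope
            h00 += w                  # entries of X^T W X
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Hypothetical data: distance to the nearest bright spot (m), leak flag.
distances_m = [50, 120, 200, 260, 400, 350, 700, 900, 1200, 1500]
leaks = [1, 1, 1, 1, 0, 1, 1, 0, 0, 0]

b0, b1 = fit_logistic(distances_m, leaks)
```

For data in which leaking wells tend to lie closer to bright spots, the fitted slope `b1` is negative, so the predicted leakage probability falls with distance.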
To identify the most suitable combination of regressors, best subset selection is employed. The main selection criterion was the prediction accuracy obtained by repeatedly and randomly splitting the visited wells into a training set and a test set, fitting the logistic regression to the training set, and using it to predict the test set. The most suitable subset turns out to employ only the distance to the nearest bright spot with polarity reversal, which yields a prediction accuracy of 89%; the corresponding regression results are given in the table below. Confidence intervals based on the normal distribution require the distance to the bright spot with polarity reversal to be normally distributed, which it is not. It can, however, be transformed to normality by adding 100 m to the original distance and then taking the natural logarithm.
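The transformation just described can be written as a short helper. The 100 m offset is taken from the text; the function names are hypothetical.

```python
import math

OFFSET_M = 100.0  # offset in metres, as stated in the text

def transform_distance(distance_m):
    """Natural logarithm of the distance (m) after adding the offset."""
    return math.log(distance_m + OFFSET_M)

def inverse_transform(value):
    """Recover the original distance (m) from a transformed value."""
    return math.exp(value) - OFFSET_M
```

The offset keeps the logarithm defined for wells that sit directly on a bright spot (distance 0 m), which map to ln(100).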
Logistic regression fit for leakage of all visited wells using distance to bright spot with polarity reversal in meters as a regressor. Please find further information on the applied statistical analyses in the supplementary material.
            Estimate    Std. Error   z value   Pr(>|z|)   Significance
Intercept   4.853946    1.735128     2.797     0.00515    0.01
Distance    −0.007361   0.002700     −2.726    0.00640    0.01
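To show how the tabulated coefficients translate into leakage probabilities, the sketch below evaluates the fitted curve for the untransformed distance. The slope is −0.007361 per metre; the intercept is read here as 4.853946 on the logit scale, which is an assumption about the table's number formatting. The function names are hypothetical.

```python
import math

INTERCEPT = 4.853946   # assumed reading of the tabulated intercept
SLOPE = -0.007361      # change in log-odds per metre of distance

def leak_probability(distance_m):
    """Predicted probability of leakage at a given distance (m)."""
    return 1.0 / (1.0 + math.exp(-(INTERCEPT + SLOPE * distance_m)))

def distance_at_probability(p):
    """Distance (m) at which the predicted probability equals p."""
    return (math.log(p / (1.0 - p)) - INTERCEPT) / SLOPE
```

Under this reading, the predicted probability is about 0.93 at 300 m and drops to 0.5 at roughly 659 m from the nearest bright spot with polarity reversal.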
The transformed logistic regression model is then used to predict the probability of leakage for the wells within our seismic data set in the Central North Sea (the data presented here).
To obtain confidence bands, the logistic regression prediction is repeated with two standard deviations subtracted from and added to the calculated probability. The point estimate predicts leakage from 926 of the 1,792 wells, with a 95% confidence interval ranging from 719 to 1,058 wells.
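One way to turn such bands into well counts is sketched below. Several choices here are assumptions rather than details given in the text: the band is applied to the linear predictor (logit scale) so that the bounds stay between 0 and 1, a well is counted as leaking when its probability exceeds 0.5, and the function name and the four example wells are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def leak_counts(linear_predictors, standard_errors, threshold=0.5, k=2.0):
    """Return (lower, point, upper) counts of wells predicted to leak.

    Each well contributes its linear predictor (log-odds) and the
    standard error of that predictor; the band is +/- k standard errors.
    """
    lower = point = upper = 0
    for eta, se in zip(linear_predictors, standard_errors):
        lower += sigmoid(eta - k * se) > threshold
        point += sigmoid(eta) > threshold
        upper += sigmoid(eta + k * se) > threshold
    return lower, point, upper

# Hypothetical log-odds and standard errors for four wells.
counts = leak_counts([2.0, 0.3, -0.4, -3.0], [0.5, 0.4, 0.3, 0.5])
```

For the four made-up wells this gives one well at the lower bound, two at the point estimate and three at the upper bound, mirroring how a point count can be bracketed by a lower and an upper count.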