Uploaded by Luis enrique navarro

Data Science and ML applied to drilling

Data Science & Machine
Learning applied to Petroleum
Engineering:
Stick & Slip focus
Luis Enrique Navarro Morales
Why?
Provide the driller with useful information ahead of time about drillstring vibration (Stick & Slip), improving the
drilling process by using the data available from other wells and by applying techniques
whose results improve as more data becomes available.
How?
Using real-time data offered by Equinor, and ML techniques that have been used in other fields and that
any computer can run.
What?
Avoid damage to the various tools used while drilling a wellbore.
The present work is intended as a thesis proposal in petroleum engineering, together with its syllabus.
Chapter 1.
MWD
Chapter 2.
Data science and Machine Learning
Chapter 3.
Stick & Slip
Chapter 4.
Case study and model construction
Artificial Intelligence (AI) aims to use data to impart human-like
decision making to machines, mimicking human behavior.
Machine Learning (ML) is a subset of AI techniques that
uses statistical methods to enable machines to improve with
experience, or to use data to make optimized inferences and
predictions.
Deep Learning is a subset of ML dedicated to filtering inputs
through layers (neural networks) in order to learn how to predict or
classify.
To build reliable Machine Learning models, a series of
disciplines must be understood and well managed.
Data is factual information: a measurement used as a basis
for reasoning, discussion or calculation. In the drilling field,
data is acquired through sensors in the drillstring, the rotary table
and the tools that form the drilling system (hoisting, hydraulics, etc.).
Features are a prominent part or characteristic of something;
in the case at hand, the features are the drilling mechanics of the
wellbore.
Algorithms are the procedures used to solve a problem or
accomplish some end.
Data mining (gathering data from different
sources) and data science (understanding that data) are
the building blocks for ML algorithms; the programming
relies on a language.
Two branches can be defined in the ML
repertoire:
- Supervised:
These algorithms need solutions, or
labels, which the model has to
predict. A classification
problem yields a categorical
answer, while a regression problem
predicts numbers.
- Unsupervised:
No explicit solutions are given to
the model; it needs to analyze the
data and find patterns.
Stick & Slip
In the drilling process, the bit, the drillstring and the
wellbore may interact in a way that creates unwanted
vibrations:
- Bit Bounce: motion that causes the bit to
repeatedly lift off and impact the formation, via
WOB fluctuations
- Bending: lateral motion that causes the drillstring to
shock the wellbore wall
- Bit Whirl: eccentric rotation of the bit, deviated
from its geometric center
- Stick & Slip: non-uniform rotation between surface and bit
Stick & Slip (S&S) is a result of wellbore friction and the
wrong combination of top drive speed (RPM) and surface weight on bit
(SWOB), resulting in the sudden sticking of the bit followed by an
acceleration of the bit while slipping, hence Stick & Slip.
This can cause operational problems (low ROP, stuck drillstring,
etc.) and damage the components of the BHA.
Due to this sudden, increased acceleration, damage may
occur to the bit, along with connection overtorque, interference with mud
telemetry, etc.
When the rotary torque applied to the bit is insufficient for its
rotation, a momentary stick occurs, followed by a release
and an acceleration that might exceed the material limit.
“STICK_RT” is obtained by subtracting the minimum downhole RPM from the maximum downhole RPM; this is done by
the MWD tool of the BHA, which has measuring gauges in its collar.
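The STICK_RT definition above reduces to a single subtraction. The sketch below only illustrates that arithmetic; the function name and the RPM readings are hypothetical and do not come from the MWD tool or the Volve data.

```python
def stick_rt(max_rpm: float, min_rpm: float) -> float:
    """Return max downhole RPM minus min downhole RPM for one window."""
    if max_rpm < min_rpm:
        raise ValueError("max_rpm must be greater than or equal to min_rpm")
    return max_rpm - min_rpm


# Hypothetical downhole RPM samples within one measurement window.
window_rpm = [118.0, 95.0, 20.0, 160.0, 210.0, 140.0]
value = stick_rt(max(window_rpm), min(window_rpm))
print(value)  # 190.0
```

A window with perfectly uniform rotation gives 0, and the value grows as the bit alternates between sticking and slipping.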
The S&S severity percentage is a measure of how much faster the BHA is
rotating compared with the surface RPM; if the rotational speed
exceeds the limit of the tool, damage occurs.
Stable drilling
The S&S solution relies on the interaction between WOB and RPM.
Field solutions include decreasing the weight on bit or increasing the
RPM; mitigation procedures have been designed by companies like
Schlumberger. Tools such as the OMNI Roller Reamer have proven
useful in mitigating S&S, although with problems of their own, such as
cutting displacements.
The measurements shown in this figure come from the
MWD tool, where the S&S is the
difference, or the space, between the
green line (Max RPM) and the blue line
(Min RPM); the tool reports only this difference as Stick & Slip
Real Time.
Other types of drillstring
vibration, such as shocks, are also shown.
As seen from the S&S, continuous
medium values led to the TD being
modified; several solutions were
applied to decrease the S&S:
- Decrease SWOB
- Increase RPM
Data gathering
In 2018, Equinor released a large number of datasets regarding the development of the Volve Field.
The data needed to be analyzed and cleaned in order to be used in the construction of a model.
The programming language in which the present work is coded is Python.
This high-level programming language is very useful for Data Science techniques
and Machine Learning implementations, with a lot of support from the community to
improve the libraries, which are the tools used to interpret and decode files.
The first step towards selecting a drilling section on which to implement a model is to decode the WITSML
(Wellsite Information Transfer Standard Markup Language) files, which carry the well-site data from
the rig to different stakeholders in the oil & gas industry.
The trajectories of a given well are transmitted using this scheme at each survey of the drilling process. The Python
library BeautifulSoup lets us decode these XML files; an algorithm was developed to create a dataframe, or data
structure, containing the survey data, so that the trajectories could be plotted using Python plotting libraries.
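The decoding step can be sketched roughly as follows. This is not the algorithm developed for the thesis: it uses the standard library's xml.etree.ElementTree in place of BeautifulSoup so that it runs without extra packages, and the sample document and tag names (trajectoryStation, md, incl, azi, dispNs, dispEw) are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed trajectory file. The tag names below are
# assumptions for illustration; check them against the real WITSML files.
SAMPLE_XML = """
<trajectorys>
  <trajectory>
    <trajectoryStation>
      <md uom="m">100.0</md>
      <incl uom="dega">0.5</incl>
      <azi uom="dega">10.0</azi>
      <dispNs uom="m">0.4</dispNs>
      <dispEw uom="m">0.1</dispEw>
    </trajectoryStation>
    <trajectoryStation>
      <md uom="m">130.0</md>
      <incl uom="dega">1.2</incl>
      <azi uom="dega">12.0</azi>
      <dispNs uom="m">0.9</dispNs>
      <dispEw uom="m">0.2</dispEw>
    </trajectoryStation>
  </trajectory>
</trajectorys>
"""

FIELDS = ("md", "incl", "azi", "dispNs", "dispEw")


def local_name(tag):
    """Drop an XML namespace prefix: '{uri}md' becomes 'md'."""
    return tag.rsplit("}", 1)[-1]


def parse_stations(xml_text):
    """Return one dict per survey station with the numeric FIELDS found."""
    root = ET.fromstring(xml_text)
    rows = []
    for element in root.iter():
        if local_name(element.tag) != "trajectoryStation":
            continue
        row = {}
        for child in element:
            name = local_name(child.tag)
            if name in FIELDS and child.text:
                row[name] = float(child.text)
        rows.append(row)
    return rows


stations = parse_stations(SAMPLE_XML)
for station in stations:
    print(station)
```

The resulting list of dictionaries can be passed directly to pandas.DataFrame to obtain the dataframe used for plotting.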
Each survey is taken with its corresponding section.
Plotting each section in a deviation plot, we can see the
extent of each section and its deviation. In the image
shown, we see the drilling sections mentioned in the
End of Well Report provided in the documents.
As seen from the deviations, the sections in
green and blue, corresponding to the hole
sizes 12 ¼" and 8 ½", respectively, are the ones
that show the most deviation from the 45° line.
To analyze the whole trajectory, we need to use the XML nodes
for the NorthSouth and EastWest coordinates, and plot
them using specialized plotting libraries such as plotly, which
give us the ability to create 3D and dynamic plots.
It must be taken into account that the planned trajectories may differ from the actual
wellbore trajectory due to rock/bit interaction, BHA vibrations and other mechanisms.
The drilling mechanics of each section are contained in the LAS files, corresponding to MWD data, as shown below.
Basically, there are two types of LAS, or well log, files: those indexed by time and those
indexed by depth.
Because the purpose of the Machine Learning model is to predict ahead of time, the scheme used will be
DateIndexed, which raises the problem of limited support from well-known libraries. To solve this issue, an algorithm was
developed to construct dataframes from this type of file. These data structures
will contain drilling mechanics: data obtained from the BHA memory and surface gauges.
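A minimal sketch of reading a time-indexed LAS file is given below. It is not the thesis algorithm. It assumes the first curve is an ISO timestamp without spaces, that -999.25 marks missing values, and that only the curve (~C) and data (~A) sections matter; the sample content and its values are hypothetical.

```python
from datetime import datetime

# Hypothetical miniature time-indexed LAS content (values are illustrative).
SAMPLE_LAS = """\
~Curve Information
TIME.S      : Time index
SWOB.kkgf   : Surface weight on bit
RPM.c/min   : Surface rotation
~ASCII
2013-05-01T10:00:00 5.1 120.0
2013-05-01T10:00:05 -999.25 121.0
2013-05-01T10:00:10 5.4 119.5
"""

NULL_VALUE = -999.25  # common LAS null marker; assumed for this sample


def parse_time_las(text):
    """Return one dict per data row, keyed by curve mnemonic."""
    mnemonics, rows, section = [], [], None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("~"):
            section = line[1:2].upper()  # 'C' for curves, 'A' for data
            continue
        if section == "C":
            # The mnemonic is the text before the first dot.
            mnemonics.append(line.split(".", 1)[0].strip())
        elif section == "A":
            parts = line.split()
            row = {mnemonics[0]: datetime.fromisoformat(parts[0])}
            for name, token in zip(mnemonics[1:], parts[1:]):
                value = float(token)
                row[name] = None if value == NULL_VALUE else value
            rows.append(row)
    return rows


records = parse_time_las(SAMPLE_LAS)
print(records[1])
```

Missing readings are kept as None so that a later step can decide whether to drop or impute them.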
The logging capabilities may differ from log to log depending on the tool used in the BHA. These tools are
mentioned in the End of Well Report, and averages are obtained from the files, on the understanding that
each file stores a different run in the drilling process.
Available runs
The data obtained from these files can be plotted in continuous form, as logs are usually presented. The most common tracks are
the ones measured at surface, such as:
- Block Position (BPOS): the position of the block in the hoisting system, showing the ups and downs of the drilling process
and the final pull out of hole (POOH)
- Block Velocity (BVEL): also related to the hoisting system, it indicates the velocity of the block during its travel
- Hook Load (HKLD): the measurement of the weight of the drillstring
- Depth (DEPT): the measured depth of the bit, which does not necessarily indicate that drilling is in progress
- Torque (TQA): the rotational force between the drillstring and the formation, given by the stiffness of the drillstring, topdrive or mud
motor
- Surface Weight on Bit (SWOB): the force exerted on the bit by the drillstring and its weight; the measurement is taken at
surface and correlated with the hook load
- Rotations Per Minute (RPM): the rotations per minute of the topdrive, transmitted at surface to the drillstring
- Rate Of Penetration (ROP): a measurement of the drilling speed, that is, how much depth is gained in an hour
Drilling remarks from the F-14 Wellbore
Run : 3
Data Science analysis: Pairplot
Tool that plots each variable against every other and shows the histogram of each
Pearson’s coefficient between each pair of drilling parameters
A straight line, indicating a direct correlation between annulus pressure and depth
As formations are drilled, a recognizable pattern emerges from the gamma ray log
As depth is gained, the temperature rises at a constant slope, while during POOH the temperature drops rapidly
While drilling, pipe must be connected; this process is seen in the load
The WOB applied to the drilling sections must remain in a defined range; TIH and POOH might deviate from this trend
S&S is present in the drilling process, mostly at the top and bottom
S&S is constant and more frequent with low ROP
ROP decreases due to a change in RPM; this might have been caused by a remedial action for S&S
Circulating bottoms up or
cleaning the hole
POOH process
Run : 4
RIH process
Top of cement is located and then drilled, causing the S&S measurement to behave
abnormally due to the casing shoe
Run : 5
Full pairplot analysis
The gamma ray response in the shallow section is well defined, while in the deeper section it ranges considerably
*Mud type()
Gaining depth, the amount of weight supported increases, and the in-between trend is lost
The wide response should be modified once the drilling process is selected.
The RPM range needed to drill remains constant across the sections
The major S&S occurrences happen in the 12 1/4" section, but its length is relatively short compared with the other sections
Lower hookload means more S&S in the shallowest section; the same cannot be said for the deeper sections
?
As seen above, the statistical properties of the S&S value differ mostly in the
number of outliers, which is caused by the quartiles of each section:
- 17.5: most of the S&S values are below 50
- 12.25: the range widens and ends up at 100
- 8.5: the range stays almost the same, but the maximum value decreases
The maximum value of the S&S measure is reached in the three sections
Once the analysis was done, the section
selected for applying the ML algorithm to predict the
S&S severity was the 12 1/4 inch section
Start drilling depth
Continuous up/down movement of the topdrive and block,
gradual increase of RPM until the appropriate value is
reached; mud pumps are stable and the joints for the
next sections of drillstring are recognizable;
the formation drilling process begins
TOC recognition and cement drilling
As seen in the previous slides, the
dataframe contains the whole run
of the BHA. This run includes cleaning
processes, trips and other non-drilling
procedures, so the next step was to
clean these data points and keep
only the drilling process. An
algorithm was developed to achieve
this goal, applying the following
concepts:
- HDTV: hole depth
- BONB: bit on bottom
- Flag: drilling or not
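One way the three concepts above could combine is sketched below. The rule is an assumption, not the thesis algorithm: a sample counts as bit on bottom when the bit depth is within a tolerance of the hole depth, and as drilling when it is on bottom and the hole depth has increased since the previous sample. The field names, the tolerance and the depths are hypothetical.

```python
def flag_drilling(records, tolerance=0.1):
    """Add BONB and Flag keys to each record (assumed rule, see above)."""
    flagged = []
    previous_hole_depth = None
    for rec in records:
        # Bit on bottom: bit depth within `tolerance` of the hole depth.
        on_bottom = (rec["hole_depth"] - rec["bit_depth"]) <= tolerance
        # Hole is deepening: hole depth grew since the previous sample.
        deepening = (
            previous_hole_depth is not None
            and rec["hole_depth"] > previous_hole_depth
        )
        flagged.append({**rec, "BONB": on_bottom, "Flag": on_bottom and deepening})
        previous_hole_depth = rec["hole_depth"]
    return flagged


# Hypothetical depth samples in metres.
samples = [
    {"hole_depth": 1000.0, "bit_depth": 950.0},    # tripping in
    {"hole_depth": 1000.0, "bit_depth": 999.95},   # tagging bottom
    {"hole_depth": 1000.5, "bit_depth": 1000.5},   # drilling
    {"hole_depth": 1001.0, "bit_depth": 1001.0},   # drilling
    {"hole_depth": 1001.0, "bit_depth": 980.0},    # pulled off bottom
]
drilling_only = [row for row in flag_drilling(samples) if row["Flag"]]
print(len(drilling_only))  # 2
```

Keeping only the rows where Flag is true removes trips, circulation and other non-drilling intervals from the training data.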
The Schlumberger Drilling Reference proposes the
following classification for S&S occurrences. A
categorical value will be added to the data
structure in order to train the ML model to
predict this categorical severity; an algorithm
was developed to carry out this process
Categorical severity: 0, 1, 2, 3
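Mapping a continuous S&S value onto the four categories can be sketched as a threshold lookup. The class boundaries of the Schlumberger classification are not reproduced in this text, so the limits below are placeholders chosen only to make the example run; they are hypothetical and must be replaced with the reference values.

```python
from bisect import bisect_right

# HYPOTHETICAL class boundaries, NOT the Schlumberger reference values.
PLACEHOLDER_LIMITS = (25.0, 50.0, 75.0)


def severity_class(stick_value, limits=PLACEHOLDER_LIMITS):
    """Return 0, 1, 2 or 3 depending on how many limits the value reaches."""
    return bisect_right(limits, stick_value)


for reading in (10.0, 25.0, 60.0, 90.0):
    print(reading, severity_class(reading))
```

With three ascending limits, bisect_right returns the number of limits that are less than or equal to the reading, which is exactly the class index from 0 to 3.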
Once the dataframe has been through
all the steps above, the S&S occurrences
are easier to see
Machine Learning model construction
As we are using a supervised
ML algorithm, the target to be predicted
must be filled in, and its predictors
as well.
One option would be to delete
the rows that contain any
missing value; the second
would be to apply an ML
algorithm known as “KNN” to
impute the missing values. This
method will also be applied
later
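The KNN imputation idea can be illustrated with a small pure-Python sketch. It assumes a single column with gaps, unscaled features and Euclidean distance, and the readings are hypothetical; in practice a library implementation such as scikit-learn's KNNImputer would normally be used instead.

```python
import math


def knn_impute(rows, target, k=2):
    """Fill None values of `target` with the mean of the k nearest complete rows."""
    features = [key for key in rows[0] if key != target]
    complete = [r for r in rows if r[target] is not None]
    filled = []
    for row in rows:
        if row[target] is not None:
            filled.append(dict(row))
            continue
        point = [row[f] for f in features]
        # Rank the complete rows by Euclidean distance on the other columns.
        nearest = sorted(
            complete,
            key=lambda c: math.dist(point, [c[f] for f in features]),
        )[:k]
        estimate = sum(c[target] for c in nearest) / len(nearest)
        filled.append({**row, target: estimate})
    return filled


# Hypothetical drilling-mechanics rows; the last torque reading is missing.
data = [
    {"RPM": 100.0, "SWOB": 5.0, "TQA": 10.0},
    {"RPM": 102.0, "SWOB": 5.2, "TQA": 12.0},
    {"RPM": 150.0, "SWOB": 9.0, "TQA": 30.0},
    {"RPM": 101.0, "SWOB": 5.1, "TQA": None},
]
imputed = knn_impute(data, target="TQA", k=2)
print(imputed[3]["TQA"])
```

The missing torque is estimated from the two rows with the most similar RPM and SWOB, while the distant third row is ignored.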
To avoid collinearity and
to eliminate parameters
that provide the same information,
the Pearson correlation coefficient
must be contained in this
interval
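The collinearity check can be sketched as follows: compute Pearson's coefficient for every pair of columns and list the pairs whose absolute value exceeds a limit. The limit is passed as a parameter because the interval used in this work is not reproduced in the text; the 0.9 in the example, the APRS mnemonic and the readings are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def collinear_pairs(columns, limit):
    """Return the column pairs whose |r| is above `limit`."""
    names = list(columns)
    pairs = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if abs(pearson(columns[first], columns[second])) > limit:
                pairs.append((first, second))
    return pairs


# Hypothetical readings: annulus pressure grows linearly with depth.
columns = {
    "DEPT": [1.0, 2.0, 3.0, 4.0, 5.0],
    "APRS": [10.0, 20.0, 30.0, 40.0, 50.0],
    "ROP": [3.0, 1.0, 4.0, 1.0, 5.0],
}
print(collinear_pairs(columns, limit=0.9))  # [('DEPT', 'APRS')]
```

One column of each reported pair can then be dropped, since it carries nearly the same information as the other.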
To train the ML model, the original dataframe must be split:
- Training set: the set of examples used for learning, that is, for fitting the parameters of the model
- Testing set: the subset of the dataframe dedicated to assessing the performance of the ML model
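A minimal sketch of such a split is shown below, assuming a random shuffle with a fixed seed and a 25% test share; both values are illustrative choices, not the ones used in this work.

```python
import random


def train_test_split_rows(rows, test_fraction=0.25, seed=42):
    """Shuffle a copy of `rows` and return (training set, testing set)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed for repeatable splits
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


rows = list(range(20))  # stand-ins for dataframe rows
train_rows, test_rows = train_test_split_rows(rows)
print(len(train_rows), len(test_rows))  # 15 5
```

Every row ends up in exactly one of the two sets, so the model is never scored on examples it was fitted on.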
Decision Tree Classifier
The tree is constructed by asking a
series of questions (decision nodes)
about the dataset at hand. Each time
an answer is received, a follow-up
question is asked, until a conclusion
about the class label (leaf node) is
reached.
Several algorithms exist to create the
trees, CART (Classification And
Regression Trees) being the most used. These
trees are the building block of the
ensemble, or bagging, algorithms, among
which the Random Forest algorithm
offers good results.
random_state: value that controls the randomness
of the algorithm, for reproducible results
max_leaf_nodes: maximum number of leaf nodes in the tree
Annotations on the plotted tree (levels Depth 0 to Depth 5):
- First question asked (root node, Depth 0)
- How pure is the node? A value > 0 means the samples it contains belong to different classes
- Number of samples in the node; as this is the root node, it is the whole training set
- Number of samples that belong to each class
- The prediction a given node will make
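The purity value shown in each node can be reproduced by hand. The sketch below assumes the measure is the Gini impurity, which is the default criterion of scikit-learn's DecisionTreeClassifier; the text does not state which criterion was used, and the label lists are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    total = len(labels)
    counts = Counter(labels)
    return 1.0 - sum((count / total) ** 2 for count in counts.values())


print(gini([0, 0, 0, 0]))  # 0.0, a pure node
print(gini([0, 0, 1, 1]))  # 0.5, two classes evenly mixed
```

A value of 0 means every sample in the node belongs to one class; any value above 0 means the node still mixes classes.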
Based on the ML model, scores can be
assigned to input features, or predictors,
based on how useful they are at predicting
a target variable.
As seen in the image below, the accuracy obtained when predicting the S&S with only 2 predictors (SWOB & RPM)
is 83%; visualizing the results, however, the severe cases of S&S are not predicted.
Classification metrics:
True Positive Rate (TPR):
Also known as recall and sometimes
sensitivity; the probability that a positive value is
classified correctly
False Positive Rate (FPR):
The probability that a negative value is misclassified
as positive by the model
ROC (Receiver Operating Characteristic):
Describes the trade-off between the TPR and
FPR across different probability thresholds for
the classifier
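Both rates follow directly from counting the four outcomes of a binary decision. The sketch below uses hypothetical labels; for the four severity classes it would be applied one class at a time, treating that class as positive and the rest as negative.

```python
def tpr_fpr(y_true, y_pred, positive=1):
    """Return (true positive rate, false positive rate)."""
    tp = fn = fp = tn = 0
    for truth, guess in zip(y_true, y_pred):
        if truth == positive:
            if guess == positive:
                tp += 1  # positive correctly detected
            else:
                fn += 1  # positive missed
        else:
            if guess == positive:
                fp += 1  # negative wrongly flagged as positive
            else:
                tn += 1  # negative correctly rejected
    return tp / (tp + fn), fp / (fp + tn)


# Hypothetical labels: 1 = S&S present, 0 = absent.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 1, 0, 0, 1]
print(tpr_fpr(y_true, y_pred))  # (0.75, 0.25)
```

Each probability threshold of the classifier yields one such (FPR, TPR) pair, and the ROC curve joins those pairs.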
From the image, we can see that the model achieves high recall values with few misclassified values
(FPR), giving an area under the curve (AUC) of 0.913, which is a very good value, taking into account that
only two predictors were given
Random Forest Classifier
The prediction of this algorithm is
obtained by a majority vote over the
predictions of the individual decision trees;
for a regression problem, the average
is calculated instead
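The aggregation step described above can be sketched in a few lines. The per-tree predictions are hypothetical, and the functions only illustrate the voting and averaging idea, not the internals of any particular library.

```python
from collections import Counter
from statistics import mean


def forest_classify(tree_predictions):
    """Return the class predicted by the largest number of trees."""
    return Counter(tree_predictions).most_common(1)[0][0]


def forest_regress(tree_predictions):
    """Return the average of the individual tree predictions."""
    return mean(tree_predictions)


# Hypothetical severity classes predicted by five trees for one sample.
print(forest_classify([2, 0, 2, 1, 2]))  # 2
# Hypothetical numeric predictions from three trees.
print(forest_regress([1.0, 2.0, 3.0]))   # 2.0
```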
Number of cores used for training;
with “-1”, all available cores are used
Out Of Bag error, validation of an RF model
Number of trees in
the forest
The main contributors to severe S&S are
RPM and SWOB; Schlumberger's solution
algorithm relies heavily on modifying these
parameters to avoid further S&S severity. The
model obtained relies mostly on TurbineRPM
and Torque, followed by Total Flow and then SWOB.
The severity prediction in the image above shows the 96% accuracy of the model.
For an ideal classifier, the AUC is
the area of a unit square, that is,
a value of 1.
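The area under a ROC curve can be approximated with the trapezoidal rule over its (FPR, TPR) points. The sketch below checks the two reference cases with hand-made points: an ideal classifier and a diagonal curve, which corresponds to random guessing.

```python
def auc(fpr_points, tpr_points):
    """Trapezoidal area under a curve given by ascending FPR values."""
    area = 0.0
    for i in range(1, len(fpr_points)):
        width = fpr_points[i] - fpr_points[i - 1]
        area += width * (tpr_points[i] + tpr_points[i - 1]) / 2.0
    return area


# Ideal classifier: TPR jumps to 1 while FPR is still 0.
print(auc([0.0, 0.0, 1.0], [0.0, 1.0, 1.0]))  # 1.0
# Diagonal ROC curve (random guessing).
print(auc([0.0, 1.0], [0.0, 1.0]))            # 0.5
```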
Once the model has been constructed
and fully analyzed, its
predictions are expected to hold for the
section analyzed.
In the previous pages, I have shown the potential benefits that Machine Learning
algorithms can bring to drilling engineering. It is important to note that
these models will become more widely applicable as the dataframe grows; further feature
engineering will be needed for the model to keep learning to predict the S&S
severity. For the scope of this work, however, only one section will be used and analyzed.