
Lars Holden

A synthetic reservoir model, known as the PUNQ-S3 case, is used to compare various techniques for quantification of uncertainty in future oil production when historical production data is available. Some results for this case have already been presented in an earlier paper. In this paper, we present some additional results for this problem, and also argue for an interpretation of the results that is somewhat different from the one presented in the earlier paper. The additional results are obtained with the following methods: (i) rejection sampling, (ii) history matching of multiple models using a pilot-point approach, and (iii) Markov chain Monte Carlo (MCMC).
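For readers unfamiliar with method (i), the sketch below shows generic rejection sampling from a posterior: models are drawn from the prior and accepted with probability proportional to their likelihood. It is illustrative only and not taken from the paper; prior_draw and log_like are hypothetical stand-ins for the reservoir model prior and the misfit between simulated and observed production data.

```python
import numpy as np

def rejection_sample(prior_draw, log_like, log_like_max, n_keep, rng):
    """Draw models from the prior; accept each with probability
    exp(log_like(m) - log_like_max), yielding samples from the posterior."""
    kept = []
    while len(kept) < n_keep:
        m = prior_draw(rng)
        if np.log(rng.random()) < log_like(m) - log_like_max:
            kept.append(m)
    return kept

# toy usage: standard normal prior, Gaussian likelihood centred at 1.0
rng = np.random.default_rng(0)
posterior = rejection_sample(
    prior_draw=lambda r: r.standard_normal(),
    log_like=lambda m: -0.5 * (m - 1.0) ** 2,
    log_like_max=0.0,            # upper bound on log_like over all models
    n_keep=1000, rng=rng)
```

In history matching, each likelihood evaluation requires a full reservoir flow simulation, which is why plain rejection sampling is practical mainly for small synthetic cases such as PUNQ-S3.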
In this paper some numerical experiments with a recently proposed model for the flow of water, oil and gas in a porous medium are discussed. We consider the case of incompressible flow and neglect capillary effects. In many oil reservoirs the three phases are in contact with each other. The most important parameters describing the flow are the three-phase relative permeabilities. Unfortunately, these data are typically very difficult to measure and are often not available.
The generation and systematic collection of genome-wide data is ever-increasing. This vast amount of data has enabled researchers to study relations between a variety of genomic and epigenomic features, including genetic variation, gene regulation and phenotypic traits. Such relations are typically investigated by comparatively assessing genomic co-occurrence. Technically, this corresponds to assessing the similarity of pairs of genome-wide binary vectors. A variety of similarity measures have been proposed for this problem in other fields like ecology. However, while several of these measures have been employed for assessing genomic co-occurrence, their appropriateness for the genomic setting has never been investigated. We show that the choice of similarity measure may strongly influence results and propose two alternative modelling assumptions that can be used to guide this choice. On both simulated and real genomic data, the Jaccard index is strongly altered by dataset size and should be used with caution. The Forbes coefficient (fold change) and tetrachoric correlation are less influenced by dataset size, but one should be aware of increased variance for small datasets. All results on simulated and real data can be inspected and reproduced at https://hyperbrowser.uio.no/sim-measure.
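As a concrete illustration of the measures compared above, here is a minimal sketch (not from the paper) computing the Jaccard index and the Forbes coefficient from the 2x2 contingency table of two binary vectors; the Forbes coefficient is the fold change of observed co-occurrence over the expectation under independence.

```python
import numpy as np

def pair_counts(x, y):
    """2x2 contingency counts for two genome-wide binary vectors."""
    a = int(np.sum((x == 1) & (y == 1)))  # co-occurrences
    b = int(np.sum((x == 1) & (y == 0)))
    c = int(np.sum((x == 0) & (y == 1)))
    d = int(np.sum((x == 0) & (y == 0)))  # joint absences
    return a, b, c, d

def jaccard(x, y):
    a, b, c, _ = pair_counts(x, y)        # note: ignores d entirely
    return a / (a + b + c)

def forbes(x, y):
    a, b, c, d = pair_counts(x, y)
    n = a + b + c + d
    return a * n / ((a + b) * (a + c))    # observed / expected co-occurrence

rng = np.random.default_rng(0)
x = rng.random(100_000) < 0.01            # two sparse, independent tracks
y = rng.random(100_000) < 0.01
print(jaccard(x, y), forbes(x, y))        # Forbes ~ 1 under independence
```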
It is proved that the standard assumptions on the Buckley-Leverett equations for three-phase flow imply that the equation system is not strictly hyperbolic. Therefore, the solution of the Buckley-Leverett equations for three-phase flow is very complicated. We also discuss four different models for the relative permeability. It is stated that Stone's model almost always gives one or more elliptic regions. Furthermore, it is proved that Marchesin's model is hyperbolic under very weak assumptions. The triangular model is hyperbolic, and the solution is well-defined and depends L1-continuously upon the initial values in the Riemann problem.
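For reference, the criterion behind these statements can be written out. With the two independent saturations collected in s = (s_w, s_g) and fractional-flow functions f_w, f_g (standard notation, assumed here rather than quoted from the paper), the three-phase Buckley-Leverett system is
$$ \frac{\partial}{\partial t} \begin{bmatrix} s_w \\ s_g \end{bmatrix} + \frac{\partial}{\partial x} \begin{bmatrix} f_w(s_w, s_g) \\ f_g(s_w, s_g) \end{bmatrix} = 0 , $$
and it is strictly hyperbolic at a state where the Jacobian of the flux has two distinct real eigenvalues. States where the eigenvalues are complex form an elliptic region, while real but coinciding eigenvalues give exactly the loss of strict hyperbolicity discussed above.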
As a novel approach we will combine trajectories, or longitudinal studies, of gene expression with information on annual influenza epidemics. Seasonality of gene expression in immune cells from blood could be a consequence of within-host seasonal immunity interacting with the seasonal epidemics of influenza (flu) in temperate regions and, thus, with potentially valuable analogy transfer to the proposed seasonal development of COVID-19. Here we operationalized within-host immunity as genes with both a significant seasonal term and a significant flu term in the sine-cosine model. Information on gene expression was based on microarrays using RNase-buffered blood samples collected randomly from a population-based cohort of Norwegian middle-aged women in 2003-2006, the Norwegian Women and Cancer (NOWAC) study. The unique discovery (N=425) and replication (N=432) designs were based on identical sampling and preprocessing. Data on the proportion of sick leaves due to flu, and the flu intensities per ...
Background: The generation and systematic collection of genome-wide data is ever-increasing. This vast amount of data has enabled researchers to study relations between a variety of genomic and epigenomic features, including genetic variation, gene regulation, and phenotypic traits. Such relations are typically investigated by comparatively assessing genomic co-occurrence. Technically, this corresponds to assessing the similarity of pairs of genome-wide binary vectors. A variety of metrics have been proposed for this problem in other fields like ecology. However, while several of these metrics have been employed for assessing genomic co-occurrence, their appropriateness for the genomic setting has never been investigated. Results: We show that the choice of metric may strongly influence results and propose two alternative modelling assumptions that can be used to guide this choice. On both simulated and real genomic data, the Jaccard index is strongly affected by dataset size and should ...
The application of sequence stratigraphy concepts to reservoir description involves the correlation of different types of (bounding) surfaces from well to well to produce a high-resolution reservoir zonation. A stochastic model has been developed for describing the geometry of different types of surfaces, and a reservoir zonation is constructed by simulating a number of surfaces from the base of the reservoir upwards. The surfaces are modelled as transformed Gaussian random fields. Conditioning on observed depths is performed by kriging, including inequality constraints for surfaces not observed in a well due to subsequent erosion. This paper focuses on the stochastic model for a particular type of surface containing erosional valleys. The valleys are modelled by fibre processes and correlated Gaussian random functions. Prior distributions for valley location and geometry are defined and updated to posterior distributions by simulating from the prior model conditioned on the observations. Information such as the depth of the boundaries observed in the wells, and the well pattern with respect to valley orientation, width and sinuosity, is thus utilized in the parameter inference.
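As a minimal, assumed illustration of the conditioning step, the sketch below performs simple kriging of surface depth along a one-dimensional transect with an arbitrary Gaussian covariance; the inequality-constrained case for eroded surfaces and the transformed-field machinery are not shown.

```python
import numpy as np

def gaussian_cov(h, sill=1.0, range_=500.0):
    """Isotropic Gaussian covariance model (assumed, for illustration)."""
    return sill * np.exp(-(h / range_) ** 2)

def simple_krige(x_obs, z_obs, x_new, mean=0.0):
    """Simple kriging of surface depth at x_new from well observations."""
    d_oo = np.abs(x_obs[:, None] - x_obs[None, :])   # obs-obs distances
    d_no = np.abs(x_new[:, None] - x_obs[None, :])   # new-obs distances
    K = gaussian_cov(d_oo) + 1e-10 * np.eye(len(x_obs))  # jitter for stability
    w = np.linalg.solve(K, gaussian_cov(d_no).T)         # kriging weights
    return mean + w.T @ (z_obs - mean)

# hypothetical well positions (m) and observed surface depths (m)
x_obs = np.array([0.0, 400.0, 1000.0])
z_obs = np.array([-2010.0, -2025.0, -2018.0])
print(simple_krige(x_obs, z_obs, np.linspace(0, 1000, 5), mean=-2020.0))
```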
The paper presents an application of a stochastic reservoir description simulator to the modelling of two sections of a coastal-deltaic reservoir, the Tarbert Formation in the Gullfaks Field. The description simulator, called DESIRÈ, models reservoir architecture by an ordered sequence of widespread facies types on which are superimposed bodies or infinitely thin barriers. Petrophysical variables are simulated conditionally on architecture. The reservoirs display features of a coastline progradation and a sandflat sequence, interspersed with bodies of tidal origin and calcite cemented zones. Contoured- and flow unit models have been constructed for the reservoirs, and sets of realisations from the models have been simulated. Production simulations of water injection have been performed on all realisations. The production results show that with the data available for analysis, significant uncertainty remains in predicted behaviour irrespective of the comparatively simple structure of...
Recent large-scale undertakings such as ENCODE and Roadmap Epigenomics have generated experimental data mapped to the human reference genome (as genomic tracks) representing a variety of functional elements across a large number of cell types. Despite the high potential value of these publicly available data for a broad variety of investigations, little attention has been given to the analytical methodology necessary for their widespread utilisation. We here present a first principled treatment of the analysis of collections of genomic tracks. We have developed novel computational and statistical methodology to permit comparative and confirmatory analyses across multiple and disparate data sources. We delineate a set of generic questions that are useful across a broad range of investigations and discuss the implications of choosing different statistical measures and null models. Examples include contrasting analyses across different tissues or diseases. The methodology has been imple...
A multidimensional mixture model for unsupervised tail estimation
We test whether there is a slight difference in the time development of the gene expression for a group of genes between two strata relative to time of diagnosis. This group of genes is only identified from all the other genes based on the observed gene expressions. We are able to show that there is a time development in the gene expression during the last year before diagnosis. The hypothesis of no time development is rejected with a p-value less than 0.001. It is also possible to analyze at a finer time resolution, but then there is more noise in the results due to the limited number of patients. The time development is most significant up to 6 months before diagnosis. For smaller quantiles (up to 100 genes) there is an indication of significance up to 48 months, but the data is so scarce that single patients seem to influence the result. We estimate that the time development relative to time of diagnosis during the last year is present in 600-1,600 genes. We are also able to make a prognosis on whether a pati...
We analyse MCMC chains focusing on how to find simulation parameters that give good mixing for discrete-time, Harris ergodic Markov chains on a general state space X having invariant distribution π. The analysis uses an upper bound for the variance of the probability estimate. For each simulation parameter set, the bound is estimated from an MCMC chain using recurrence intervals. Recurrence intervals are a generalization of recurrence periods for discrete Markov chains. This makes it easy to compare the mixing properties for different simulation parameters. The paper gives general advice on how to improve the mixing of the MCMC chains and a new methodology for finding an optimal acceptance rate for the Metropolis-Hastings algorithm. Several examples, both toy examples and large complex ones, illustrate how to apply the methodology in practice. We find that in some of these examples the optimal acceptance rate is smaller than the general recommendation in the literature.
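The recurrence-interval machinery itself is specific to the paper, but the role of the acceptance rate is easy to demonstrate. The sketch below is illustrative only (it does not implement the variance bound or recurrence intervals): it runs a random-walk Metropolis sampler on a standard normal target and shows how the acceptance rate falls as the proposal scale grows.

```python
import numpy as np

def rw_metropolis(log_target, x0, scale, n_steps, rng):
    """Random-walk Metropolis-Hastings; returns samples and acceptance rate."""
    x, lp = x0, log_target(x0)
    samples, accepted = np.empty(n_steps), 0
    for i in range(n_steps):
        prop = x + scale * rng.standard_normal()   # symmetric proposal
        lp_prop = log_target(prop)
        if np.log(rng.random()) < lp_prop - lp:    # Metropolis accept step
            x, lp = prop, lp_prop
            accepted += 1
        samples[i] = x
    return samples, accepted / n_steps

log_target = lambda x: -0.5 * x**2                 # standard normal target
rng = np.random.default_rng(1)
for scale in (0.1, 1.0, 2.4, 10.0):
    _, rate = rw_metropolis(log_target, 0.0, scale, 50_000, rng)
    print(f"proposal scale {scale:5.1f}: acceptance rate {rate:.2f}")
```

Small scales accept almost every proposal but explore slowly, while large scales rarely accept; the tuning question the paper addresses is where between these extremes mixing is best.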
In this paper we analyze the solution of the Riemann problem for the following one-dimensional conservation law:
$$ \frac{\partial}{\partial t}\,\theta \begin{bmatrix} u \\ v \end{bmatrix} + \frac{\partial}{\partial x} \begin{bmatrix} \frac{u^{2}}{2} + v\rho v \\ u(v - \rho) \end{bmatrix} = 0 . $$
Statistics is one of the paramount sciences of the 21st century because of the emergence of data as a key asset for development. It is of fundamental importance to deliver statistical methods and tools for the progress of science and sustainable economic growth. Statistics for Innovation (sfi)² was, between 2007 and 2014, a centre of excellence providing new statistical methodology for selected Norwegian industries and service providers. In this paper, we trace the history of (sfi)², describe the way it operated and the type of projects it has run over 8 years. We discuss challenges and the way we tried to overcome them. The purpose of this paper is to share our experience for similar centres, outline the advantages and challenges and indicate some recommendations.
A method of solving the saturation equations (first order hyperbolic conservation laws) by approximating the fractional flow function by a piecewise linear function has been developed. The method makes it possible to solve the saturation equations without stability problems. In this new front-tracking method, the pressure equation is solved by a finite element method. The grid for the pressure equation can therefore be fitted to the reservoir geometry with great flexibility. A reservoir simulator based on the new methods for the solution of saturation equations is under development. The simulator is able to simulate two-phase immiscible flow in two dimensions at present. Both gravity and compressibility are included, as well as heterogeneities in the geological data. The simulator has been used for field simulation problems. It has proved to be insensitive to grid orientation and numerical dispersion. In addition, for larger grid systems, the simulator is computationally more efficient than finite difference simulators. A simpler version of the front-tracking simulator can also be used as a streamline simulator. The streamline simulator has the same CPU-efficiency and front-tracking capabilities as traditional streamtube simulators without the need to transform the reservoir into a set of one-dimensional regions.
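The core of the front-tracking idea can be shown compactly. In a Riemann problem for s_t + f(s)_x = 0 with s_left < s_right, the entropy solution follows the lower convex envelope of the flux, and with a piecewise linear flux every envelope segment becomes a front travelling at the slope of that segment (its Rankine-Hugoniot speed). The sketch below is an assumed illustration using a generic Corey-type fractional-flow curve, not the paper's simulator.

```python
import numpy as np

def fractional_flow(s):
    """Example Buckley-Leverett fractional flow (quadratic rel. perms)."""
    return s**2 / (s**2 + (1.0 - s)**2)

def front_speeds(s_left, s_right, n=200):
    """Front speeds for s_left < s_right: slopes of the lower convex
    envelope of a piecewise linear approximation of the flux."""
    s = np.linspace(s_left, s_right, n)
    f = fractional_flow(s)
    hull = [0]
    for i in range(1, n):                 # monotone-chain lower envelope
        while len(hull) >= 2:
            i0, i1 = hull[-2], hull[-1]
            turn = (s[i1] - s[i0]) * (f[i] - f[i0]) \
                 - (f[i1] - f[i0]) * (s[i] - s[i0])
            if turn <= 0:                 # pop points above the envelope
                hull.pop()
            else:
                break
        hull.append(i)
    idx = np.array(hull)
    return np.diff(f[idx]) / np.diff(s[idx])   # increasing front speeds

print(front_speeds(0.0, 1.0)[:3])  # speeds of the leading fronts
```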
Background: In active run-in trials, where patients may be excluded after a run-in period based on their response to the treatment, it is implicitly assumed that patients have individual treatment effects. If individual patient data are available, active run-in trials can be modelled using patient-specific random effects. With more than one trial on the same medication available, one can obtain a more precise overall treatment effect estimate. Methods: We present a model for joint analysis of a two-sequence, four-period cross-over trial (AABB/BBAA) and a three-sequence, two-period active run-in trial (AB/AA/A), where the aim is to investigate the effect of a new treatment for patients with pain due to osteoarthritis. Results: Our approach enables us to separately estimate the direct treatment effect for all patients, for the patients excluded after the active run-in trial prior to randomisation, and for the patients who completed the active run-in trial. A similar model approach can be...
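As a hedged illustration of the patient-specific random-effects idea (not the paper's actual joint model; the file name and column names are made up), a pooled mixed-model fit might look like this:

```python
import pandas as pd
import statsmodels.formula.api as smf

# assumed long-format data, one row per patient-period observation, with
# hypothetical columns: pain (outcome), treatment, period, trial, patient_id
data = pd.read_csv("pooled_trials.csv")

# random intercept plus a random treatment slope per patient, so every
# patient gets an individual treatment effect; period and trial are fixed
model = smf.mixedlm("pain ~ treatment + period + trial", data,
                    groups=data["patient_id"], re_formula="~treatment")
print(model.fit().summary())
```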
We will explain different methods for integrating horizon and fault models. An example, using previously developed horizon and fault models, will be given.
Rebalancing of portfolios with a concave utility function is considered. It is proved that transaction costs imply that there is a no-trade region where it is optimal not to trade. For proportional transaction costs it is optimal to rebalance to the boundary when outside the no-trade region. With flat transaction costs, the rebalance from outside the no-trade region should be to a state at an internal surface in the no-trade region, but never a full rebalance. The standard optimal portfolio theory is extended to n symmetric assets, a general utility function, and more general stochastic processes. Examples are discussed.
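The resulting policies are simple enough to state as code. The sketch below uses hypothetical band and target values and a single risky asset for concreteness; it contrasts the two cases described above, where proportional costs rebalance to the nearest boundary of the no-trade region while flat costs jump to an interior surface, and neither ever performs a full rebalance to the frictionless optimum.

```python
def rebalance_proportional(w, lo, hi):
    """Proportional costs: trade only outside [lo, hi], and only to the
    nearest boundary of the no-trade region."""
    return min(max(w, lo), hi)

def rebalance_flat(w, lo, hi, target_lo, target_hi):
    """Flat costs: when outside the no-trade region, jump to an interior
    target surface (lo < target_lo <= target_hi < hi), not the boundary."""
    if w < lo:
        return target_lo
    if w > hi:
        return target_hi
    return w                      # inside the region: do not trade

# hypothetical numbers: no-trade band [0.35, 0.45] around a 0.40 optimum
print(rebalance_proportional(0.50, 0.35, 0.45))      # -> 0.45 (boundary)
print(rebalance_flat(0.50, 0.35, 0.45, 0.38, 0.42))  # -> 0.42 (interior)
```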
Objective: This explorative study aimed to assess if there are any time-dependent blood gene expression changes during the first one to eight years after breast cancer diagnosis which can be linked to the clinical outcome of the disease. Material and methods: A random distribution of follow-up time from breast cancer diagnosis until blood sampling was obtained by a nested, matched case-control design in the Norwegian Women and Cancer Post-genome Cohort. From 2002 to 2005, women were invited to donate blood samples, regardless of any cancer diagnosis. At the end of the study period in 2015, any cancer diagnoses among the 50 000 participants were obtained via linkage to the Norwegian Cancer Registry. For each breast cancer patient (n = 415), an age- and storage time-matched control was drawn. The design gave a uniform, random length of follow-up time, independent of cancer stage. Differences in blood gene expression between breast cancer cases and controls were identified using the Bioconductor R-pa...
The analyses in this note are based on a dataset with gene expression in blood before diagnosis of breast cancer. The dataset consists of case‐control pairs that a...
Model-free knockoffs is a recently proposed technique for identifying covariates that are likely to have an effect on a response variable. It is an efficient method for controlling the false discovery rate in hypothesis tests for separate covariates. This paper presents a generalisation of the technique using multiple sets of model-free knockoffs; this is formulated as an open question in Candes et al. [4]. With multiple knockoffs, we are able to reduce the randomness in the knockoffs, making the result stronger. Since we use the same structure for generating all the knockoffs, the computational cost grows far less than proportionally with the number of knockoffs. We prove a bound on the asymptotic false discovery rate, as the number of sets increases, that is better than the published bounds for one set.
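For orientation, the single-set selection rule that this paper generalises is short. The sketch below implements the knockoff+ threshold of Candes et al. for given knockoff statistics W (the paper's multiple-knockoff extension is not reproduced here): variable j is selected when W_j >= T, where T is the smallest t at which the estimated false discovery proportion drops below the target level q.

```python
import numpy as np

def knockoff_plus_threshold(W, q):
    """Smallest t with (1 + #{W_j <= -t}) / max(1, #{W_j >= t}) <= q."""
    for t in np.sort(np.abs(W[W != 0])):
        fdp_hat = (1 + np.sum(W <= -t)) / max(1, np.sum(W >= t))
        if fdp_hat <= q:
            return t
    return np.inf                          # no threshold achieves level q

rng = np.random.default_rng(2)
W = np.concatenate([rng.normal(3.0, 1.0, 20),    # statistics of true signals
                    rng.normal(0.0, 1.0, 180)])  # symmetric null statistics
T = knockoff_plus_threshold(W, q=0.10)
selected = np.flatnonzero(W >= T)
```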
This article describes the development of the Norwegian Historical Population Register, which is the first open national register. In the period 1735–1964, 9.7 million people lived in Norway, and for them 37.5 million events (such as birth, death, or migration) have been recorded in sources. We link together as many events as possible for the same persons and families, but only include links that have a high probability of being correct. The linking is performed by automatic methods and crowdsourcing. A national population register is important for migration research. It allows us to reconstruct (stepwise) internal migration in Norway, frequently followed by international migration from Norway, as well as return migration to Norway. Many non-Norwegian sources also specify place of birth by country, and this makes it possible to identify individuals in Norwegian sources.
Genome-wide, cell-type-specific profiles are being systematically generated for numerous genomic and epigenomic features. There is, however, no universally applicable analytical methodology for such data. We present GSuite HyperBrowser, the first comprehensive solution for integrative analysis of dataset collections across the genome and epigenome. The GSuite HyperBrowser is an open-source system for streamlined acquisition and customizable statistical analysis of large collections of genome-wide datasets. The system is based on new computational and statistical methodologies that permit comparative and confirmatory analyses across multiple disparate data sources. Expert guidance and reproducibility are facilitated via a Galaxy-based web-interface. The software is available at https://hyperbrowser.uio.no/gsuite
The understanding of changes in temporal processes related to human carcinogenesis is limited. One approach for prospective functional genomic studies is to compile trajectories of differential expression of genes, based on measurements from many case-control pairs. We propose a new statistical method that does not assume any parametric shape for the gene trajectories. The trajectory of a gene is defined as the curve representing the changes in gene expression levels in the blood as a function of time to cancer diagnosis. In a nested case-control design it consists of differences in gene expression levels between cases and controls. Genes can be grouped into curve groups, each curve group corresponding to genes with a similar development over time. The proposed new statistical approach is based on a set of hypothesis tests that can determine whether or not there is development in gene expression levels over time, and whether this development varies among different strata. Curve gr...
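A trajectory in this sense is easy to compute nonparametrically. The sketch below is an assumed illustration (a plain moving average, not the paper's test procedure): it turns the case-control expression differences of one gene, ordered by time to diagnosis, into a curve without assuming any parametric shape.

```python
import numpy as np

def gene_trajectory(diffs, times, window=12.0):
    """Moving-average curve of case-control log-expression differences
    as a function of time to diagnosis (window in months)."""
    order = np.argsort(times)
    t, d = np.asarray(times)[order], np.asarray(diffs)[order]
    curve = np.array([d[np.abs(t - ti) <= window].mean() for ti in t])
    return t, curve
```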
Identification of three-dimensional (3D) interactions between regulatory elements across the genome is crucial to unravel the complex regulatory machinery that orchestrates proliferation and differentiation of cells. ChIA-PET is a novel method to identify such interactions, where physical contacts between regions bound by a specific protein are quantified using next-generation sequencing. However, determining the significance of the observed interaction frequencies in such datasets is challenging, and few methods have been proposed. Despite the fact that regions that are close in linear genomic distance have a much higher tendency to interact by chance, no methods to date are capable of taking such dependency into account. Here, we propose a statistical model taking into account the genomic distance relationship, as well as the general propensity of anchors to be involved in contacts overall. Using both real and simulated data, we show that the previously proposed statistical test, ...
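The distance dependence mentioned above can be built into a null model in a few lines. The sketch below is a simplified stand-in for the paper's model: it bins anchor pairs by genomic distance and scores observed counts against a Poisson null, while the paper's additional adjustment for per-anchor contact propensity is omitted.

```python
import numpy as np
from scipy.stats import poisson

def expected_by_distance(counts, dists, n_bins=10):
    """Null expectation: mean observed count within genomic-distance bins,
    so each pair is compared with pairs at a similar linear distance."""
    edges = np.quantile(dists, np.linspace(0, 1, n_bins + 1))
    idx = np.clip(np.digitize(dists, edges[1:-1]), 0, n_bins - 1)
    means = np.array([counts[idx == b].mean() for b in range(n_bins)])
    return means[idx]

def interaction_pvalues(counts, dists):
    mu = expected_by_distance(counts, dists)
    return poisson.sf(counts - 1, mu)      # P(X >= observed) under the null
```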
