This article reports the findings of the Workshop on High Performance
Computing and Communications (HPCC) for Grand Challenge Applications:
Computer Vision, Speech and Natural Language Processing (SNLP), and Artificial
Intelligence (AI). The goals of the workshop are to identify applications,
research problems, and designs of HPCC systems for supporting applications
in these areas. In computer vision, we have identified the main scientific issues as
machine learning, surface reconstruction, inverse optics and integration,
model acquisition, and perception and action. Since vision algorithms
operate in different levels of granularity, computers for supporting these
algorithms need to be heterogeneous and modular. Advances in technology,
new architectural concepts, and software design methods are essential for
this area. In SNLP, we have identified issues in statistical analysis in
corpus-based speech and language understanding, search strategies for language
analysis, auditory and vocal-tract modeling, integration of multiple levels
of speech and language analyses, and connectionist systems. Similar to
algorithms in computer vision, algorithms in SNLP require high
computational power, ranging from general purpose supercomputing to special pur-
pose VLSI systems. As processing has various requirements, a hybrid
heterogeneous computer system is the most desirable. In AI, important issues that need immediate attention include the
development of efficient machine learning and heuristic search methods that
can adapt to different architectural configurations, and the design and
construction of scalable and verifiable knowledge bases, active memories,
and artificial neural networks. A computer system for supporting AI
applications is heterogeneous, requiring research in high speed computer networks, mass storage and efficient retrieval methods, computer languages,
and hardware and software design tools. Research in these areas is inherently multidisciplinary and will
require active participation of researchers in device and networking
technologies, signal processing, computer architecture, software engineering,
and knowledge engineering. Besides extending current frontiers in
research, an important aspect to be emphasized is the integration of
existing components and results into working systems.
In this paper, we have studied various supervised learning methods for
training feed-forward neural networks. In general, such learning can be
considered as a nonlinear global optimization problem in which the goal is
to minimize a nonlinear error function that spans the space of weights,
using heuristic strategies that look for global optima (in contrast to
local optima). We have surveyed various global optimization methods
suitable for neural-network learning, and have proposed the Novel method
(Nonlinear global Optimization method Via External Lead) for nonlinear
optimization and neural network learning. By combining global and local
searches, we show how Novel can be used to find a good local minimum in
the error space. Our key idea is to use a user-defined trace that pulls
a search out of a local minimum without having to restart it from a new
starting point. Using five benchmark problems, we have compared Novel
against some of the best global optimization algorithms and have
demonstrated its superior improvement in performance. The key points studied and summarized in this paper are as follows:
(a) Global optimization involves strategies to escape from local
minima after reaching there,
(b) A good global search strategyy can find better local minima,
(c) The search space of small neural networks is more rugged, requiring
more powerful global search methods to get convergence,
(d) A better local minimum in the error space does not always lead to
neural networks that generalize better,
(e) Smaller networks tend to generalize better.
Sun Sparc object code training software and results
In this paper, we propose the concept of regression cubes in which regression models can be
obtained for any data cuboid efficiently without accessing the raw data. The traditional classification
of measures into distributive, algebraic, and holistic measures cannot accommodate advanced
statistical models such as regression and filters. In this paper, we propose a fundamentally new
class of measures, compressible measures, in order to support efficient computation of the statistical
models. For compressible measures, we can efficiently compute the statistical models at an
arbitrary data cell, while each data cell is compressed into a small auxiliary matrix with a fixed
size independent of the number of tuples. We can then compute the statistical measures for any
data cell from the compressed data of the lower-level cells without accessing the raw data. Time-
and space-efficient lossless aggregation formulae are derived for regression and filtering measures.
Our analytical and experimental studies show that the proposed approach substantially reduces
the memory usage and the overall response time for statistical analysis of multidimensional data.
Efficient scheduling techniques of computing resources
are essential for achieving satisfactory performance for
users as computer systems and their applications become more complex.
In this paper, we survey research on scheduling algorithms,
review previous classifications of scheduling problems,
and present a broader classification scheme.
Using a uniform terminology for scheduling strategies
and the new classification scheme, previous
work on scheduling strategies is reviewed,
and trends in scheduling research are identified.
Finally, a methodology for developing scheduling strategies
is presented.
In this paper, we present the design of a system for automatically
learning and evaluating new heuristic methods that can be used to map a set
of communicating processes on a network of computers. Our learning system
is based on testing a population of competing heuristic methods within a
fixed time constraint. We develop and analyze various resource scheduling
strategies based on a statistical model that trades between the number of
new heuristic methods considered and the amount of testing performed on
each. We implement a prototype learning system (Teacher 4.1) for learning
new heuristic methods used in Post-Game Analysis, a system that iteratively
generates and refines mappings of a set of communicating processes on a
network of computers. Our performance results show that a significant
improvement can be obtained by a systematic exploration of the space of
possible heuristic methods.
In this paper we present new methods for the automated learning of
heuristics in knowledge-lean applications and for finding heuristics that can be
generalized to unlearned domains. These applications lack domain knowledge
for credit assignment; hence, operators for composing new heuristics are
generally model-free, domain independent, and syntactic in nature. The
operators we have used are genetics-based; examples of which include
mutation and cross-over. Learning is based on a generate-and-test paradigm
that maintains a pool of competing heuristics, tests them to a limited
extent, creates new ones from those that perform well in the past, and
prunes poor ones from the pool. We have studied three important issues in
learning better heuristics: (a) anomalies in performance evaluation, (b)
rational scheduling of limited computational resources in testing candidate
heuristics in single-objective as well as multi-objective learning, and (c)
finding heuristics that can be generalized to unlearned domains. We show
experimental results in learning better heuristics for (a) process
placement for distributed-memory multicomputers, (b) node decomposition in a
branch-and-bound search, (c) generation of test patterns in VLSI circuit
testing, and (d) VLSI cell placement and routing.
In this paper, we model search algorithms with meta-control, allowing
resource constraints, approximation, and parallel processing to be
incorporated easily in the search process. The basic building block of the
model is a hierarchical search process (HSP) consisting of context-free and
context-sensitive grammars classified according to problem-independent and
problem-dependent parts. The context-sensitive components are used mainly
for evaluating decision parameters and in ordering production rules in the
context-free grammar. The execution of the grammars for given initial
conditions may invoke other HSPs already defined in the system. We describe
ISE (acronym for Integrated Search Environment), a tool that implements
hierarchical searches with meta-control. By separating the
problem-dependent and problem-independent components in ISE, new search methods
based on a combination of existing methods can be developed easily by
coding a single master control program. Further, new applications solved by
searches can be developed by coding the problem-dependent parts and reusing
the problem-independent parts already developed. We describe the organization of ISE and present experiments carried out on the system.
In this paper, we study a new learning model for designing heuristics
automatically under resource constraints. We focus on improving
performance-related heuristic methods (HMs) in knowledge-lean application
domains. We assume that learning is episodic, that the performance
measures of an episode are dependent only on the final state reached in
evaluating the corresponding test case and are independent of the
intermediate decisions and internal states, and that the aggregate performance
measures of the HMs involved are independent of the order of evaluation of
test cases. The learning model is based on testing a population of
competing HMs for an application problem and switches from one to another
dynamically, depending on the outcome of previous tests. Its goal is to find a
good HM within the resource constraints, with proper tradeoff between cost
and quality. It complements existing point-based machine learning models
that maintain only one incumbent HM, and that tests the HM extensively
before switching to alternative ones. It extends existing work on classifier
systems by addressing issues related to delays in feedback, scheduling of
tests of HMs under limited resources, anomalies in performance evaluation,
and scalability of HMs. Finally, we describe our experience in applying
the learning method on five application problems.
In this paper, we develop new methods for adjusting configuration
parameters of genetic algorithms operating in a noisy environment. Such methods
are related to the scheduling of resources for tests performed in genetic
algorithms. Assuming that the population size is given, we address two
problems related to the design of efficient scheduling algorithms
specifically important in noisy environments. First, we study the duration-scheduling
problem that is related to setting dynamically the duration of
each generation. Second, we study the sample-allocation problem that
entails the adaptive determination of the number of evaluations taken from
each candidate in a generation. In our approach, we model the search
process as a statistical selection process and derive equations useful for
these problems. Our results show that our adaptive procedures improve the
performance of genetic algorithms over that of commonly used static ones.
Two-level pipelining in processor arrays (PAs) involves pipelining of
operations across processing elements (PEs) and pipelining of operations in
functional units in each PE. Although it is an attractive method for
improving the throughput of PAs, existing methods for generating PAs with
two-level pipelining are restricted and cannot systematically explore the
entire space of feasible designs. In this paper, we extend a systematic
design method, called General Parameter Method (GPM), we have developed
earlier to find optimal designs of PAs with two-level pipelines. The basic
idea is to add new constraints on periods of data flows to include the
effect of internal functional pipelines in the PEs. As an illustration, we
present pipelined PA designs for computing matrix products. For $n$-dimensional meshes and other symmetric problems, we provide an efficient
scheme to obtain a pipelined PA from a non-pipelined PA using a reindexing
transformation. This scheme is used in GPM as a pruning condition to
arrive at optimal pipelined PAs efficiently. For pipelines with minimum initiation
interval (MII) greater than unity, we show additional constraints
that ensure correctness of the synthesized PAs.
In this paper, we have studied the scheduling of computational
resources for genetic algorithms operating in a noisy environment.
Assuming that the number of candidates in each generation is fixed, we
have studied two related scheduling problems: determining a suitable
duration for a generation, and finding the appropriate number of
samples to be drawn from each candidate. Our results show that dynamic
scheduling policies perform better that existing static ones.
Although we have used simple test functions in our experiments, the
advantage of using dynamic scheduling is universal and is independent
of the reproduction mechanism of individual genetic algorithms.
Processor arrays are frequently used to deliver high performance in many
applications with computationally intensive operations. This paper
presents the General Parameter Method (GPM), a systematic parameter-based
approach for synthesizing such algorithm-specific architectures. GPM can
synthesize processor arrays of any lower dimension from a
uniform-recurrence description of the algorithm. The design objective is a general
non-linear and non-monotonic user-specified function, and depends on
attributes such as computation time of the recurrence on the processor array,
completion time, load time, and drain time. In addition, bounds on some or
all of these attributes can be specified. GPM performs an efficient search
of polynomial complexity to find the optimal design satisfying the
user-specified design constraints. As an illustration, we show how GPM can be
used to find optimal linear processor arrays for computing transitive
closures. We consider design objectives that minimize computation time, or
processor count, or completion time (including load and drain times), and
user-specified constraints on number of processing elements and/or
computation/completion times. We show that GPM can be used to obtain
optimal designs that trade between number of processing elements and
completion time, thereby allowing the designer to choose a design that best meets
the specified design objectives. We also show the equivalence between the
model assumed in GPM and that in the popular dependence-based methods.
Consequently, GPM can be used to find optimal designs for both models.
In this paper, we present the design of a processor array (PA) with bounded
number of input/output (I/O) ports (L-by-N PEs with O(L) I/O ports, where
1 leq L leq N) for computing transitive closures. Previous designs of PAs to
solve this problem use either N-by-N PAs with O(N) I/O ports or linear PAs
with O(1) I/O ports. These designs are restricted by their I/O bandwidth
because designs using O(N) I/O ports are limited by the number of PEs that
can be supported by these ports, whereas designs with O(1) ports are
limited by the I/O time to load and drain data. In contrast, our proposed
design uses O(L) I/O ports, where L is a technology-dependent design
parameter. In our proposed architecture, we have used more powerful processing
elements (PEs), each with a microprogram-driven ALU, a limited number of
I/O buffers and O(N/L) memory words. Our technique for mapping
transitive-closure computations consists of four steps. (1) Starting from
a mapping of a transitive-closure problem on an N-by-N PA, we map the
computations into an L-by-N PA. (2) We then schedule the computations in the
L-by-N PA in order to satisfy the dependencies. (3) Based on the new
mapping, we derive speeds and directions of data flows and the number of
buffers required in each PE. (4) Finally, we derive the microprogram in
each PE for correct execution. We show that our design specializes to
efficient designs in the boundary cases when L=1 (linear PA) and L=N (square
mesh).
In this paper, we study the performance of various IDA*-style searches and
investigate methods to improve their performance by predicting in each
stage the threshold to be used for pruning. Without loss of generality, we
consider minimization problems in this paper. We first present three
models to approximate the distribution of the number of search nodes by
lower bounds: exponential, geometric, and linear, and illustrate these
distributions based on some well-known combinatorial search problems. Based on
these distributions, we show the performance of an ideal IDA* algorithm and
identify reasons why existing IDA*-style algorithms perform well. In
practice, we will be able to know from experience the type of distribution for
a given problem instance, but will not be able to know the parameters of
this distribution until the instance is solved. Hence, we develop RIDA*, a
method that estimates dynamically the parameters of the distribution, and
predicts the best threshold to be used in each stage. Finally, we compare
the performance of several IDA*-style algorithms --- Korf's IDA* and RBFS,
RIDA*, IDA*_CR and DFS* --- on several application problems, and identify
conditions under which each of these algorithms will perform well.
This paper describes Dynamic Workload Generator (DWG), a facility for
generating realistic and reproducible synthetic workloads for use in
load-balancing experiments. For such experiments, the generated workload must
not only mimic the highly dynamic resource-utilization patterns found on
today's distributed systems but also behave as a real workload does when
test jobs are run concurrently with it. The latter requirement is
important in testing alternative load-balancing strategies, a process that
requires running the same job multiple times, each time at a different site
but under an identical network-wide workload. Parts of DWG are implemented
inside the operating-system kernel and have complete control over the
utilization levels of four key resources: CPU, memory, disk, and network.
Besides accurately replaying network-wide load patterns recorded earlier,
DWG gives up a fraction of its resources each time a new job arrives and
reclaims these resources upon job completion. The latter operation is
controlled by pattern-doctoring rules implemented in DWG. We present DWG's
architecture, its doctoring rules, systematic methods for adjusting and
evaluating doctoring rules, and experimental results on a network of Sun
workstations.
In this paper we present a method called Novel (Nonlinear Optimization via
External Lead) for solving continuous and discrete global optimization
problems. Novel addresses the balance between global search and local
search, using a trace to aid in identifying promising regions before
committing to local searches. We discuss Novel for solving continuous
constrained optimization problems and show how it can be extended to solve
constrained satisfaction and discrete satisfiability problems. We first
transform the problem using Lagrange multipliers into an unconstrained
version. Since a stable solution in a Lagrangian formulation only
guarantees a local optimum satisfying the constraints, we propose a global
search phase in which an aperiodic and bounded trace function is added to
the search to first identify promising regions for local search. The trace
generates an information-bearing trajectory from which good starting points
are identified for further local searches. Taking only a small portion of
the total search time, this elegant approach significantly reduces
unnecessary local searches in regions leading to the same local optimum.
We demonstrate the effectiveness of Novel on a collection of continuous
optimization benchmark problems, finding the same or better solutions while
satisfying the constraints. We extend Novel to discrete constraint
satisfaction problems (CSPs) by showing an efficient transformation method
for CSPs and the associated representation in finite-difference equations
in Novel. We apply Novel to solve Boolean satisfiability instances in
circuit fault detection and circuit synthesis applications, and show
comparable performance when compared to the best existing method.
In this paper, we develop TCGD, a problem-independent, time-constrained,
approximate guided depth-first search (GDFS) algorithm. The algorithm is
designed to achieve the best ascertained approximation degree under a fixed
time constraint. We consider only searches with finite search space and
admissible heuristic functions. We study NP-hard combinatorial
optimization problems with polynomial-time computable feasible solutions.
For the problems studied, we observe that the execution time increases
exponentially as approximation degree decreases, although anomalies may
happen. The algorithms we study are evaluated by simulations using the
symmetric traveling-salesperson problem.
In this paper, we present two learning mechanisms for artificial neural
networks (ANNs) that can be applied to solve classification problems with
binary outputs. These mechanisms are used to reduce the number of hidden
units of an ANN when trained by the cascade-correlation learning algorithm
(CAS). Since CAS adds hidden units incrementally as learning proceeds, it
is difficult to predict the number of hidden units required when
convergence is reached. Further, learning must be restarted when the number of
hidden units is larger than expected. Our key idea in this paper is to
provide alternatives in the learning process and to select the best
alternative dynamically based on run-time information obtained. Mixed-mode
learning (MM), our first algorithm, provides alternative output matrices so
that learning is extended to find one of the many one-to-many mappings
instead of finding a unique one-to-one mapping. Since the objective of
learning is relaxed by this transformation, the number of learning epochs
can be reduced. This in turn leads to a smaller number of hidden units
required for convergence. Population-based Learning for ANNs (PLAN), our
second algorithm, maintains alternative network configurations and to
select at run time promising networks to train based on error information
obtained and time remaining. This dynamic scheduling avoids training
possibly unpromising ANNs to completion before exploring new ones. We show
the performance of these two mechanisms by applying them to solve the
two-spiral problem, a two-region classification problem, and the Pima Indian
Diabetes Diagnosis problem.
In this paper, we present new results on the automated generalization of
performance-related heuristics learned for knowledge-lean applications. By
first applying genetics-based learning to learn new heuristics for some
small subsets of test cases in a problem space, we study methods to
generalize these heuristics to unlearned subdomains of test cases. Our method
uses a new statistical metric called probability of win. By assessing the
performance of heuristics in a range-independent and distribution-independent manner, we can compare heuristics across problem subdomains in
a consistent manner. To illustrate our approach, we show experimental
results on generalizing heuristics learned for sequential circuit testing,
VLSI cell placement and routing, branch-and-bound search, and blind
equalization. We show that generalization can lead to new and robust heuristics
that perform better than the original heuristics across test cases of
different characteristics.
Problem solvers employ strategies in searching for solutions to given problem instances.
Strategies have traditionally been designed by experts using prior knowledge and refined
manually using trial and error. Recent attempts to automate these processes have produced
strategy-learning systems. This paper shows how various issues in strategy learning
are affected by the nature of performance tasks, problem solvers, and learning environments.
Surveyed learning systems are grouped by the commonality of their approaches into four
general architectures.
Satisfiability is a class of NP-complete problems that model a wide range
of real-world applications. These problems are difficult to solve because
they have many local minima in their search space, often trapping greedy
search methods that utilize some form of descent. In this paper, we
propose a new discrete Lagrange-multiplier-based global-search method for
solving satisfiability problems. We derive new approaches for applying
Lagrangian methods in discrete space, show that equilibrium is reached when
a feasible assignment to the original problem is found, and present
heuristic algorithms to look for equilibrium points. Our method and
analysis provides a theoretical foundation and generalization of local
search schemes that optimize the objective alone and clause-weight schemes
that optimize the constraints alone. In contrast to local search methods
that restart from a new starting point when a search reaches a local trap,
the Lagrange multipliers in our method provide a force to lead the search
out of a local minimum and move it in the direction provided by the
Lagrange multipliers. In contrast to clause-weight schemes that rely only
on the weights of violated constraints to escape from local minima, our
method also uses the value of an objective function (in this case the
number of violated constraints) to provide further guidance. The dynamic
shift in emphasis between the objective and the constraints, depending on
their relative values, is the key of Lagrangian methods. One of the major
advantages of our method is that it has very few algorithmic parameters to
be tuned by users, and the search procedure can be made deterministic and
the results, reproducible. We demonstrate our method by applying it to
solve an extensive set of benchmark problems archived in DIMACS of Rutgers
University. Our method often performs better than the best existing
methods and can achieve an order-of-magnitude speedup for some problems.
Dynamic load-balancing strategies for distributed systems seek to improve
average completion time of independent tasks by migrating each incoming
task to the site where it is expected to finish the fastest: usually the
site having the smallest load index. SMALL is an off-line learning system
for developing configuration-specific load-balancing strategies; it learns
new load indices as well as tunes the parameters of given migration
policies. Using a dynamic workload generator, a number of typical
systemwide load patterns are first recorded; the completion times of
several benchmark jobs are then measured at each site, under each of the
recorded load patterns. These measurements are used to simultaneously train
comparator neural networks, one per site. The comparators collectively
model a set of perfect load indices in that they seek to rank, at arrival
time, the possible destinations for an incoming task by their (not yet
known) respective completion times. The numerous parameters of the
decentralized dynamic load-balancing policy are then tuned using a genetic
algorithm. We present experimental results for a mix of scientific and
interactive workloads on Sun workstations connected by Ethernet. The
policies tuned by SMALL are shown to intelligently and effectively exploit
idle resources.
In this paper, we study the design of a coprocessor (CoP) to execute
efficiently recursive algorithms with uniform dependencies. Our design is
based on two objectives: (a) fixed bandwidth to main memory (MM) and (b)
scalability to higher performance without increasing MM bandwidth. Our CoP
has an access unit (AU) organized as multiple queues, a processor array
(PA) with regularly connected processing elements (PEs), and input/output
networks for data routing. Our design is unique because it addresses
input/output bottleneck and scalability, two of the most important issues
in integrating processor arrays in current systems. To allow processor
arrays to be widely usable, they must be scalable to high performance with
little or no impact on the supporting memory system. The use of multiple
queues in AU also eliminates the use of explicit data addresses, thereby
simplifying the design of the control program. We present a mapping
algorithm that partitions a data dependence graph (DG) of an application
into regular blocks, sequences the blocks through AU, and schedules the
execution of the blocks, one at a time, on PA. We show that our mapping
procedure minimizes the amount of communication between blocks in the
partitioned DG, and sequences the blocks through AU to reduce the
communication between AU and MM. Using the matrix-product and
transitive-closure applications, we study design trade-offs involving (a) division of
a fixed chip area between PA and AU, and (b) improvements in speedup with
respect to increases in chip area. Our results show, for a fixed chip
area, (a) that there is little degradation in throughput in using a linear
PA as compared to a PA organized as a square mesh, and (b) that the design
is not sensitive to the division of chip area between PA and AU. We
further show that, for a fixed throughput, there is an inverse square root
relationship between speedup and total chip area. Our study demonstrates
the feasibility of a low-cost, memory bandwidth-limited, and scalable
coprocessor system for evaluating recurrent algorithms with uniform
dependencies.
In this paper, we present a new method to handle inequality constraints and
apply it in Novel (Nonlinear Optimization via External Lead), a system we
have developed for solving constrained continuous nonlinear optimization
problems. In general, in applying Lagrange-multiplier methods to solve
these problems, inequality constraints are first converted into equivalent
equality constraints. One such conversion method adds a slack variable to
each inequality constraint in order to convert it into an equality
constraint. The disadvantage of this conversion is that when the search is
inside a feasible region, some satisfied constraints may still pose a
nonzero weight in the Lagrangian function, leading to possible oscillations
and divergence when a local optimum lies on the boundary of a feasible
region. We propose a new conversion method called the MaxQ method such
that all satisfied constraints in a feasible region always carry zero
weight in the Lagrange function; hence, minimizing the Lagrange function in
a feasible region always leads to local minima of the objective function.
We demonstrate that oscillations do not happen in our method. We also
propose methods to speed up convergence when a local optimum lies on the
boundary of a feasible region. Finally, we show improved experimental
results in applying our proposed method in Novel on some existing benchmark
problems and compare them to those obtained by applying the method based on
slack variables.
Lagrangian methods are popular in solving continuous constrained optimization
problems. In this paper, we address three important issues in applying
Lagrangian methods to solve optimization problems with inequality constraints.
First, we study methods to transform inequality constraints into equality constraints.
An existing method, called the slack-variable method, adds a slack variable to each
inequality constraint in order to transform it into an equality constraint.
Its disadvantage is that when the search trajectory is inside a feasible region,
some satisfied constraints may still pose some effect on the Lagrangian function,
leading to possible oscillations and divergence when a local minimum lies on the
boundary of the feasible region. To overcome this problem, we propose the
MaxQ method that carries no effect on satisfied constraints.
Hence, minimizing the Lagrangian function in a feasible region always
leads to a local minimum of the objective function. We also study some
strategies to speed up its convergence.
Second, we study methods to improve the convergence speed
of Lagrangian methods without affecting the solution quality.
This is done by an adaptive-control strategy that dynamically adjusts the
relative weights between the objective and the Lagrangian part, leading to
better balance between the two and faster convergence.
Third, we study a trace-based method to pull the search trajectory from
one saddle point to another in a continuous fashion without restarts.
This overcomes one of the problems in existing Lagrangian methods that converges
only to one saddle points and requires random restarts to look for new saddle points,
often missing good saddle points in the vicinity of saddle points already found.
Finally, we describe a prototype Novel (Nonlinear Optimization via External
Lead) that implements our proposed strategies and present improved solutions in
solving a collection of benchmarks.
In this paper, we define the generalization problem, summarize
various approaches in generalization, identify the credit assignment
problem, and present the problem and some solutions in measuring
generalizability. We discuss anomalies in the ordering of hypotheses
in a subdomain when performance is normalized and averaged, and show
conditions under which anomalies can be eliminated. To generalize performance
across subdomains, we present a measure called probability of win
that measures the probability whether a hypothesis is better than another.
Finally, we discuss some limitations in using probabilities of win and
illustrate their application in finding new parameter values for
TimberWolf, a package for VLSI cell placement and routing.
In this paper, we present a new discrete Lagrangian method
for designing multiplierless QMF (quadrature mirror filter) banks.
The filter coefficients in these filter banks are in powers-of-two
(PO2), where numbers are represented as sums or differences of powers of two
(also called Canonical Signed Digit--CSD--representation), and multiplications
are carried out as additions, subtractions and shifts.
We formulate the design problem as a nonlinear discrete constrained
optimization problem, using reconstruction error as the objective,
and stopband and passband energies, stopband and passband ripples
and transition bandwidth as constraints.
Using the performance of the best existing designs as constraints,
we search for designs that improve over the best existing designs
with respect to all the performance metrics.
We propose a new discrete Lagrangian method for finding good designs,
and study methods to improve the convergence speed of Lagrangian
methods without affecting their solution quality.
This is done by adjusting dynamically the relative weights between
the objective and the Lagrangian part.
We show that our method can find designs that improve over
Johnston's benchmark designs using a maximum of three to six ONE
bits in each filter coefficient instead of using floating-point representations.
Our approach is general and is applicable to
the design of other types of multiplierless filter banks.
In this paper, we present a new search method based on the theory of discrete Lagrange multipliers
for designing multiplierless PR (perfect reconstruction) LP (linear phase) filter banks.
To satisfy the PR constraints, we choose a lattice structure that, under
certain conditions, can guarantee the resulting two filters to be a PR pair.
Unlike the design of multiplierless QMF filter banks that represents
filter coefficients directly using PO2 (powers-of-two) form (also called
Canonical Signed Digit or CSD representation), we use PO2 forms to represent
the parameters associated with the lattice structure.
By representing these parameters as sums or differences of powers of two,
multiplications can be carried out as additions, subtractions, and shifts.
Using the lattice representation, we decompose the
design problem into a sequence of four subproblems.
The first two subproblems find a good starting point with continuous
parameters using a single-objective, multi-constraint formulation.
The last two subproblems first transform the continuous solution found by the
second subproblem into a PO2 form, then search for a design in a mixed-integer space.
We propose a new search method based on the theory of discrete
Lagrange multipliers for finding good designs, and study methods to improve
its convergence speed by adjusting dynamically the relative weights between
the objective and the Lagrangian part.
We show that our method can find good designs using at most four terms in PO2 form
in each lattice parameter.
Our approach is unique because our results are the first successful designs of
multiplierless PR-LP filter banks. It is general because it is applicable to the
design of other types of multiplierless filter banks.
In this paper, we explore the loss behavior encountered in
transmitting real-time voice over the Internet and propose a
new loss-concealment scheme to improve its received quality.
One known technique to conceal loss is to send interleaved
streams of voice samples and reconstruct missing or late
samples by interpolation at the receiver. Based on this
method, we propose a new transformation-based reconstruction
algorithm. Its basic idea is for the sender to transform an
input voice stream, according to the interpolation method
used at the receiver and the predicted loss behavior, before
interleaving the stream. The transformation is derived by
minimizing reconstruction error in case of loss. We show
that our method is computationally efficient and can be
extended to various interleaving factors and interpolation-
based reconstruction methods. Finally, we show performance
improvements of our method by testing it over the Internet.
Nonlinear constrained optimization problems in discrete and continuous
spaces are an important class of problems studied extensively in operations
research and artificial intelligence.
These problems can be solved by a Lagrange-multiplier method in continuous
space and by an extended discrete Lagrange-multiplier method in discrete space.
When constraints are satisfied, these methods rely on gradient
descents in the objective space to find high-quality solutions.
On the other hand, when constraints are violated, these methods rely on
gradient ascents in the Lagrange-multiplier space in order to increase the
penalties on unsatisfied constraints and to force these constraints into satisfaction.
The balance between gradient descents and gradient ascents depends on the
relative weights between the objective function and the constraints, which indirectly
control the convergence speed and solution quality of the Lagrangian method.
To improve convergence speed without degrading solution quality, we
propose an algorithm to dynamically control the relative weights between the
objective and the constraints. Starting from an initial weight, the algorithm
automatically adjusts the weights based on the behavior of the search progress.
With this strategy, we are able to eliminate divergence, reduce oscillations,
and speed up convergence. We show improved convergence behavior of our proposed
algorithm on both nonlinear continuous and discrete problems.
This paper studies various strategies in constrained simulated annealing (CSA),
a global optimization algorithm that achieves asymptotic convergence to
constrained global minima (CGM) with probability one for solving discrete
constrained nonlinear programming problems (NLPs).
The algorithm is based on the necessary and sufficient condition for discrete
constrained local minima (CLM) in the theory of discrete Lagrange multipliers
and its extensions to continuous and mixed-integer constrained NLPs.
The strategies studied include adaptive neighborhoods, distributions to
control sampling, acceptance probabilities, and cooling schedules.
We report much better solutions than the best-known solutions in the literature
on two sets of continuous benchmarks and their discretized versions.
In this paper we study the solution of SAT problems formulated as
discrete decision and discrete constrained optimization problems.
Constrained formulations are better than traditional unconstrained formulations because violated
constraints may provide additional forces to lead a search towards a satisfiable assignment.
We summarize the theory of extended saddle points in penalty formulations for solving discrete
constrained optimization problems and the associated discrete penalty method (DPM).
We then examine various formulations of the objective function, choices of neighborhood in
DPM, strategies for updating penalties, and heuristics for avoiding traps.
Experimental evaluations on hard benchmark instances pinpoint that traps contribute
significantly to the inefficiency of DPM and force a trajectory to repeatedly visit the
same set of or nearby points in the original variable space.
To address this issue, we propose and study two trap-avoidance strategies.
The first strategy adds extra penalties on unsatisfied clauses inside a trap,
leading to very large penalties for unsatisfied clauses that are trapped more
often and making these clauses more likely to be satisfied in the future.
The second strategy stores information on points visited
before, whether inside traps or not, and avoids visiting points that are close to points visited before.
It can be implemented by modifying the penalty function in such a way that, if a trajectory gets close
to points visited before, an extra penalty will take effect and force the trajectory to a new region.
It specializes to the first strategy because traps are special cases of points visited before.
Finally, we show experimental results on evaluating benchmarks in the DIMACS and SATLIB
archives and compare our results with existing results on GSAT, WalkSAT, LSDL, and Grasp. The results
demonstrate that DPM with trap avoidance is robust as well as effective for solving hard SAT problems.
Packet and compression losses are two sources of quality losses when streaming
compressed video over unreliable IP networks, such as the Internet.
In this paper, we propose two new approaches for concealing such losses.
First, we present a joint sender-receiver approach for designing transforms in multi-description
coding (MDC).
In the receiver, we use a simple interpolation-based reconstruction algorithm,
as sophisticated concealment techniques cannot be employed in real time.
In the sender, we design an optimized reconstruction-based discrete cosine transform
(ORB-DCT) with an objective of minimizing the mean squared error, assuming that some of the
descriptions are lost and that the missing information is reconstructed by simple averaging
at the destination. Second, we propose an artificial neural network
to compensate for compression losses introduced in MDC.
Experimental results show that our proposed algorithms perform well in real Internet tests.
Time-series predictions by artificial neural networks (ANNs) are traditionally
formulated as unconstrained optimization problems. As an unconstrained
formulation provides little guidance on search directions when a search
gets stuck in a poor local minimum, we have proposed to use a constrained
formulation in order to use constraint violations to provide additional
guidance. In this paper, we formulate ANN learning with cross-validations
for time-series predictions as a non-differentiable nonlinear constrained
optimization problem. Based on our theory of Lagrange multipliers for discrete
constrained optimization, we propose an efficient learning algorithm, called
violation guided back-propagation (VGBP), that computes an approximate
gradient using back-propagation (BP), that introduces annealing to avoid
blind acceptance of trial points, and that applies a relax-and-tighten
(R&T) strategy to achieve faster convergence. Extensive experimental
results on well-known benchmarks, when compared to previous work, show
one to two orders-of-magnitude improvement in prediction quality, while
using less weights.
In this paper, we present a new approach for designing filter banks for image compression.
This approach has two major components: optimization and generalization.
In the optimization phase, we formulate the design problem as a nonlinear optimization
problem whose objective consists of both the performance metrics of the image coder,
such as the peak signal-to-noise ratio (PSNR), and those of individual filters.
Filter banks are optimized in the optimization phase based on a set of training images.
In the generalization phase, the filter bank that can be generalized to other images
is selected from the candidates obtained in the optimization phase to be the final result.
The filter bank selected should perform well not only on the training examples used in
the design process but also on test cases not seen.
In contrast to existing methods that design filter banks independently from the other
operations in an image compression algorithm, our approach allows us to find filter
banks that work best in a specific image compression algorithm for a certain class of images.
In system prototype development, we adopt the agent-based approach to achieve better modularity,
portability, and scalability. Agents in the multi-agent system are specialized in performing
problem formulation, image compression, optimization, and generalization. In the experiments,
we show that on a set of benchmark images our approach has found filter banks that perform
better than the existing filter banks in different image compression algorithms and at different
compression ratios.
Quality-delay trade-offs can be made in transmitting subband-coded
images in the Internet by using either the TCP or the UDP protocol.
Delivery by TCP gives superior decoding quality but with very long
delays when the network is unreliable, whereas delivery by UDP has
negligible delays but with degraded quality when packets are lost.
Although images are delivered primarily by TCP today, we study in this paper the use
of UDP to deliver multi-description reconstruction-based subband-coded images and the
reconstruction of missing information at the receiver based on information received.
We first determine empirically the interleaving factors that should be used in
order to keep the probability of unrecoverable packet losses sufficiently small.
Next, we propose a joint sender-receiver approach for designing
transforms in multi-description subband coding.
In the receiver, we use a simple interpolation-based reconstruction algorithm,
as sophisticated concealment techniques cannot be employed in practice.
In the sender, we design an optimized reconstruction-based subband
transform (ORB-ST), with an objective of minimizing the mean squared error,
assuming that some of the descriptions are lost and that the missing
information is reconstructed by simple averaging at the destination.
Experimental results show that our proposed ORB-ST performs well in real Internet
tests, and UDP delivery of MDC images is an attractive alternative to TCP delivery.
A fundamental issue in real-time interactive voice transmissions over
unreliable IP networks is the loss or late arrival of packets for playback.
This problem is especially serious when transmitting low bit rate-coded
speech with pervasive dependencies introduced.
In this case, the loss or late arrival of a single packet
will lead to the loss of subsequent dependent frames.
In this paper, we study end-to-end loss-concealment
schemes for ensuring high quality in playback.
We propose a novel multiple description-coding method for
concealing packet losses in transmitting low bit rate-coded speech.
Based on high correlations observed in linear predictor parameters--in the form
of Line Spectral Paris (LSPs)--of adjacent frames, we generate
multiple descriptions in senders by interleaving LSPs, and
reconstruct lost LSPs in receivers by linear interpolations.
As excitation codewords have low correlations, we further enlarge the
segment size for excitation generation and replicate excitation codewords
in all the descriptions in order to maintain the same transmission bandwidth.
Our proposed scheme can be extended easily to more than two descriptions
and can adapt its number of descriptions dynamically to network-loss conditions.
Experimental results on FS-1016 CELP, ITU G.723.1, and
FS MELP coders show good performance of our scheme.
At the 2001 IEEE International Conference on Data Mining in San Jose, California on
November 29 - December 2, 2001, there was a panel discussion on how data mining research
meets practical development. One of the motivations for organizing the panel discussion
was to provide useful advice for industrial people to explore their directions in data
mining development. Based on the panel discussion, this paper presents the views and
arguments from the panel members, the Conference Chair and the Program Committee Co-Chairs.
These people as a group have both academic and industrial experiences in different data
mining related areas such as databases, machine learning, and neural networks. We will
answer such as (1) how far is data mining from practical development, (2) how data mining
research differs from practical development, and (3) what are the most promising areas in
data mining for practical development?
This paper presents a procedural framework that unifies various
mechanisms to look for discrete-neighborhood saddle points in
solving discrete constrained optimization problems (DCOPs).
Our approach is based on the necessary and sufficient condition on local
optimality in discrete space, which shows the one-to-one correspondence between the discrete-space
constrained local minima of a problem and the saddle points of the corresponding Lagrangian function.
To look for such saddle points, we study various mechanisms for performing ascents of the Lagrangian
function in the original-variable subspace and descents in the Lagrange-multiplier subspace.
Our results show that CSAEA, a combined constrained simulated
annealing and evolutionary algorithm, performs well when using
mutations and crossovers to generate trial points and accepting them
based on the Metropolis probability. We apply iterative deepening to
determine the optimal number of generations in CSAEA and show that
its performance is robust with respect to changes in population
size. To test the performance of the procedures developed, we apply them
to solve some continuous and mixed-integer nonlinear programming (NLP)
benchmarks and show that they obtain better results
than those of existing algorithms.
In this paper, we study the partitioning of constraints in temporal plan-
ning problems formulated as mixed-integer nonlinear programming (MINLP)
problems. Constraint partitioning is attractive because it leads to much
easier subproblems, where each is a significant relaxation of the original
problem. Moreover, each subproblem is very similar to the original problem
and can be solved by any existing solver with little or no modification.
Constraint partitioning, however, introduces global constraints that may be
violated when subproblems are evaluated independently. To reduce the over-
head in resolving such global constraints, we develop in this paper new
conditions and algorithms for limiting the search space to be backtracked
in each subproblem. Using a penalty formulation of a MINLP where the con-
straint functions of the MINLP are transformed into non-negative functions,
we present a necessary and sufficient extended saddle-point condition (ES-
PC) for constrained local minimization. When the penalties are larger than
some thresholds, our theory shows a one-to-one correspondence between a
constrained local minimum of the MINLP and an extended saddle point of the
penalty function. Hence, one way to find a constrained local minimum is to
increase gradually the penalties of those violated constraints and to look
for a local minimum of the penalty function using any existing algorithm
until a solution to the constrained model is found. Next, we extend the
ESPC to constraint-partitioned MINLPs and propose a partition-and-resolve
strategy for resolving violated global constraints across subproblems. Us-
ing the discrete-space ASPEN and the mixed-space MIPS planners to solve
subproblems, we show significant improvements on some planning benchmarks,
both in terms of the quality of the plans generated and the execution times
to find them.
``Data Mining: How Research Meets Practical Development,''
Knowledge and Information Systems: An International Journal,
Springer-Verlag, vol. 5, no. 2, April 2003, pp. 248-261.
We study in this paper the partitioning of the constraints of a temporal planning
problem by subgoals, their sequential evaluation before parallelizing the actions,
and the resolution of inconsistent global constraints across subgoals.
Using a Lagrangian formulation and the theory of extended saddle points,
we propose a global-search strategy that looks for local minima in the
original-variable space of the Lagrangian function and for local maxima in the
Lagrange-multiplier space. Our approach improves over a previous scheme that
partitions constraints along the temporal horizon. The previous scheme leads to many global
constraints that relate states in adjacent stages, which means that an incorrect
assignment of states in an earlier stage of the horizon may violate a global constraint
in a later stage of the horizon. To resolve the violated global constraint in
this case, state changes will need to propagate sequentially through multiple stages, often
leading to a search that gets stuck in an infeasible point for an extended period of time.
In this paper, we propose to partition all the constraints by subgoals and to add new global constraints in
order to ensure that state assignments of a subgoal are consistent with those in other subgoals.
Such an approach allows the information on incorrect state assignments in one subgoal to propagate quickly
to other subgoals. Using MIPS as the basic planner in a partitioned implementation, we demonstrate significant
improvements in time and quality in solving some PDDL2.1 benchmark problems.
``Subgoal Partitioning and Global Search for Solving Temporal Planning Problems in Mixed Space,''
Int'l J. of Artificial Intelligence Tools, World Scientific Publishing Co. Pte. Ltd.,
vol. 13, no. 4, Dec. 2004, pp. 767-790.
For packet video, information loss and bandwidth limitation are two factors that affect video playback quality.
Traditional rate allocation approaches have focused on optimizing video quality under bandwidth constraint alone.
However, in the best-effort Internet, packets carrying video data are susceptible to losses, which need to be
reconstructed at the receiver side. In this paper, we propose loss aware rate allocations in both group-of-block
(GOB) level and macroblock level, given that certain packets are lost during transmissions and reconstructed
using simple interpolation methods at the receiver side. Experimental results show that our proposed algorithms
can produce videos of higher quality when sent over lossy Internet.
``Loss Aware Rate Allocations in H.263 Coded Video Transmissions,''
Journal of Circuits, Systems, and Computers, World Scientific Publishing Co. Pte. Ltd.,
vol. 14, no. 6, Dec. 2005. pp. 1157-1171.
Real-time surveillance systems, telecommunication systems, and other dynamic environments often
generate tremendous (potentially infinite) volume of stream data: the volume is too huge to be scanned multiple
times. Much of such data resides at rather lowlevel of abstraction, whereas most analysts are interested in relatively
high-level dynamic changes (such as trends and outliers). To discover such high-level characteristics, one may need
to perform on-line multi-level, multi-dimensional analytical processing of stream data. In this paper, we propose
an architecture, called stream cube, to facilitate on-line, multi-dimensional, multi-level analysis of stream data.
For fast online multi-dimensional analysis of stream data, three important techniques are proposed for efficient
and effective computation of stream cubes. First, a tilted time frame model is proposed as a multi-resolution model
to register time-related data: the more recent data are registered at finer resolution, whereas the more distant data
are registered at coarser resolution. This design reduces the overall storage of time-related data and adapts nicely
to the data analysis tasks commonly encountered in practice. Second, instead of materializing cuboids at all levels,
we propose to maintain a small number of critical layers. Flexible analysis can be efficiently performed based on
the concept of observation layer and minimal interesting layer. Third, an efficient stream data cubing algorithm
is developed which computes only the layers (cuboids) along a popular path and leaves the other cuboids for
query-driven, on-line computation. Based on this design methodology, stream data cube can be constructed and
maintained incrementally with a reasonable amount of memory, computation cost, and query response time. This
is verified by our substantial performance study.
Stream data cube architecture facilitates online analytical processing of stream data. It also forms 36 a preliminary
data structure for online stream data mining. The impact of the design and implementation of stream data cube in
the context of stream data mining is also discussed in the paper.
``The Stream Cube: An Architecture for Multi-Dimensional Analysis of Data Streams,''
Distributed and Parallel Databases: Special Issue on Data Warehousing and Data Streams,
Springer-Verlag, Sept. 2005.
In this paper, we present the partitioning of mutual-exclusion (mutex) constraints in
temporal planning problems and its implementation in the SGPlan planner. Based on the strong
locality of mutex constraints observed in many benchmarks of the Fourth International Planning Competition (IPC4),
we propose to partition the constraints of a planning problem into groups based on their subgoals.
Constraint partitioning leads to significantly easier subproblems that are similar to the original problem and
that can be efficiently solved by the same planner with some modifications to its objective function.
We present a partition-and-resolve strategy that looks for locally optimal subplans in constraint-partitioned
temporal planning subproblems and that resolves those inconsistent global constraints across the subproblems.
We also discuss some implementation details of SGPlan, which include the resolution of
violated global constraints, techniques for handling producible resources, landmark analysis,
path finding and optimization, search-space reduction, and modifications of Metric-FF when
used as a basic planner in SGPlan. Last, we show results on the sensitivity of each
of these techniques in quality-time trade-offs and experimentally demonstrate that SGPlan
is effective for solving the IPC3 and IPC4 benchmarks.
``Temporal Planning using Subgoal Partitioning and Resolution in SGPlan,''
J. of Artificial Intelligence Research,
AAAI Press, vol. 26, Aug. 2006, pp.323-369.
A temporal flexible planning problem that involves contingent and
requirement events can be formulated as a simple temporal network
with uncertainty (STNU). An STNU is controllable when there
is a strategy for executing the requirement events (or actions) in
such a way that all the conditions involving contingent events
can be satisfied in all situations. The most interesting and
useful controllability property is dynamic controllability in
which the remaining actions in an STNU can always be scheduled
under all possible feasible durations of future contingent events when all the
past contingent events are known. In this paper, we propose and study a
novel problem of assigning bounds on the duration of each requirement
link in order for the resulting STNU to be dynamically controllable and
to minimize the total cost over the allowed durations of all requirement links.
We first prove the NP hardness of the problem with a
linear cost function. We then formulate the dynamic
controllability of an STNU as the constraints in a nonlinear
optimization problem. Finally, we present methods for reducing the
number of constraints in order to make the problem tractable and
to demonstrate the computational performance of our methods.
``Optimization of Bounds in Temporal Flexible Plans with Dynamic Controllability,''
Int'l J. of Artificial Intelligence Tools, World Scientific Publishing Co. Pte. Ltd.,
Feb. 2007, vol. 16, no. 1, pp. 17-44.
As OLAP engines being widely used nowadays to support multidimensional data analysis and
intelligent decision making, it is desirable to support in data cubes advanced statistical measures,
such as regression and filtering measures, in addition to the traditional simple measures such as
count and average. Such advanced measures will allow users to model, smooth, and predict the
trends and patterns of data. Although raw data is generally collected at the lowest level, an analyst
is often more interested in generating higher and multiple levels of abstraction without accessing
the raw data. Efficient computation of aggregated statistical measures in multidimensional spaces
is a challenging problem as existing aggregation algorithms for simple distributive and algebraic
measures are inadequate for such computation.
``Regression Cubes with Lossless Compression and Aggregation,''
IEEE Trans. on Knowledge and Data Engineering,
vol. 18, no. 12, Dec. 2006, pp. 1585-1599.
In this paper, we study strategies in incremental planning for ordering and grouping subproblems partitioned
by the subgoals of a planning problem. To generate a rich set of partial orders for ordering subproblems,
we propose an algorithm based on a relaxed plan that ignores the delete lists. The new algorithm considers
both the initial and the goal states and can effectively order subgoals in such a way that greatly reduces the
number of invalidations during incremental planning. We have also considered trade-offs between the granularity
of the subgoal sets and the complexity of solving the overall planning problem. We propose an efficient strategy
for dynamically adjusting the grain size in partitioning in order to minimize the total complexity.
We further evaluate a redundant-ordering scheme that uses two different subgoal orders to improve the solution
quality, without greatly sacrificing run-time efficiency. Experimental results on using Metric-FF, YAHSP, and
LPG-TD-speed as the embedded planners in incremental planning show that our strategies are general for improving
the time and quality of these planners across various benchmarks. Finally, we compare the performance of the
three planners, the incremental versions using these planners as embedded planners, and SGPlan4.1.
``Subgoal Ordering and Granularity Control for Incremental Planning,''
Int'l J. of Artificial Intelligence Tools, World Scientific Publishing Co. Pte. Ltd.,
vol. 16, no. 4, Aug. 2007, pp. 707-723.
Data cube computation is one of the most essential but expensive operations in data warehousing. Previous studies have
developed two major approaches, top-down versus bottom-up. The former, represented by the MultiWay Array Cube (called the
MultiWay) algorithm, aggregates simultaneously on multiple dimensions; however, it cannot take advantage of a priori pruning
when computing iceberg cubes (cubes that contain only aggregate cells whose measure values satisfy a threshold, called the
iceberg condition). The latter, represented by BUC, computes the iceberg cube bottom-up and facilitates a priori pruning.
BUC explores fast sorting and partitioning techniques; however, it does not fully explore multidimensional simultaneous
aggregation. In this paper, we present a new method, Star-Cubing, that integrates the strengths of the previous two
algorithms and performs aggregations on multiple dimensions simultaneously. It utilizes a star-tree structure,
extends the simultaneous aggregation methods, and enables the pruning of the group-bys that do not satisfy the
iceberg condition. Our performance study shows that Star-Cubing is highly efficient and outperforms the previous methods.
IEEE Trans. on Knowledge and Data Engineering,
vol. 19, no. 1, Jan. 2007, pp. 111-126.
In this paper, we develop a new constrained artificial-neural-network (ANN) formulation and the
associated learning algorithm for predicting stock prices, a difficult time-series prediction problem.
We characterize daily stock prices as a noisy non-stationary time
series and identify its predictable low-frequency components.
Using a recurrent FIR (finite-impulse-response) ANN, we formulate the learning problem
as a constrained optimization problem, develop constraints for incorporating
cross validations, and solve the learning problem using algorithms based on the
theory of extended saddle points for nonlinear constrained optimization.
Finally, we illustrate our prediction results on ten stock-price time series.
Our main contributions in this paper are the channel-specific low-pass filtering of noisy time series
obtained by wavelet decomposition, the transformation of the low-pass signals to improve their stationarity,
and the incorporation of constraints on cross validation that can improve the accuracy of predictions.
Our experimental results demonstrate good prediction accuracy and annual returns.
``Constrained Formulations and Algorithms for Predicting Stock Prices by Recurrent FIR Neural Networks,''
Int'l J. of Information Technology and Decision Making, World Scientific Pub. Co.
vol. 5, no. 4, Dec. 2006, pp. 638-658.
In this paper, we present constrained simulated annealing (CSA), an algorithm that extends
conventional simulated annealing to look for constrained local minima of nonlinear constrained optimization problems.
The algorithm is based on the theory of extended saddle points (ESPs) that shows the one-to-one
correspondence between a constrained local minimum and an ESP of the corresponding penalty function.
CSA finds ESPs by systematically controlling probabilistic descents in the
problem-variable subspace of the penalty function and probabilistic ascents in the penalty subspace.
Based on the decomposition of the necessary and sufficient ESP condition
into multiple necessary conditions, we present constraint-partitioned simulated annealing
(CPSA) that exploits the locality of constraints in nonlinear optimization problems.
CPSA leads to much lower complexity as compared to that of CSA by partitioning the constraints of a problem into
significantly simpler subproblems, solving each independently, and resolving those violated global constraints
across the subproblems.
We prove that both CSA and CPSA asymptotically converge to a constrained global minimum with probability one
in discrete optimization problems. The result extends conventional simulated annealing (SA), which guarantees asymptotic
convergence in discrete unconstrained optimization, to that in discrete constrained optimization.
Moreover, it establishes the condition under which optimal solutions can
be found in constraint-partitioned nonlinear optimization problems.
Finally, we evaluate CSA and CPSA by applying them to solve some continuous constrained
optimization benchmarks and compare their performance to that of other penalty methods.
``Simulated Annealing with Asymptotic Convergence for Nonlinear Constrained Optimization,''
Journal of Global Optimization, Springer-Verlag,
vol. 39, pp. 1-37, 2007.
In this paper, we present a search algorithm called real-time search for
solving combinatorial optimization problems under real-time constraints.
The problems we study are NP-hard and have good heuristic algorithms for
generating feasible solutions. The algorithm we develop aims at finding
the best possible solution in a given deadline. Since this objective is
generally not achievable without first solving the problem, we use an
alternative heuristic objective that looks for the solution with the best
ascertained approximation degree. Our algorithm schedules a sequence of
guided depth-first searches, each searching for a more accurate solution
(based on the approximation degree set), or solutions deeper in the search
tree (based on the threshold set), or a combination of both. Five versions
of the RTS algorithm for setting approximation degrees and/or thresholds
are formulated, evaluated, and analyzed. We develop an exponential model
for characterizing the relationship between the approximation degrees
achieved and the time taken. Our results show that our RTS algorithm finds
solutions better than a guided depth-first search, and solutions found are
very close to the best possible solution that can be found in the time
allowed. We evaluate our results using the symmetric traveling-salesman,
the knapsack, and the production-planning problems.
In this paper we present a parameter-based approach for synthesizing
systolic architectures from uniform recurrence equations. The scheme presented
is a generalization of the Parameter Method proposed by Li and Wah. The
approach synthesizes optimal arrays of any lower dimension from a general
uniform recurrence description of the problem. In other previous attempts
for mapping uniform recurrences into lower-dimensional arrays, optimality
of the resulting designs is not guaranteed. As an illustration of the
technique, optimal linear arrays for matrix multiplication are given. Due
to space limitation, proofs to theorems in this paper are not shown. In a
companion paper, we present details of the proof of Theorems 1 and 3 and a
different approach for defining the constraints shown in Theorem 2. A
detailed design for solving path-finding problems is also presented there.
In this paper, we develop parallel multistage selection strategies
for guiding the concurrent evaluation of candidates from a large pool of
candidates, with the objective of finding the candidate with the maximum
(resp. minimum) value in a limited amount of time.
We present a statistical model for finding the appropriate number
of candidates to be sampled and when each should be sampled.
We verify the analytical results by simulations and evaluate the
performance of strategies that are analytically intractable.
Our performance results show that there is linear speedup
in parallel processing over selection done sequentially.
Results obtained from application of our selection method
to learn heuristics for a real-world application demonstrate
that heuristics with improved performance can be found
by a systematic search of the space of feasible heuristics in limited time.
Due to space limitation, we are unable to present the details
of the experiments and the exact heuristics found.
Most existing methods for synthesizing systolic architectures can only map
n-dimensional recurrences to n-1-dimensional arrays. In this paper, we
generalize the parameter-based approach of Li and Wah to map n-dimensional
uniform recurrences to any k-dimensional processor arrays, where k
In this paper, we propose a novel search algorithm called band search that
generalizes guided depth-first and best-first searches. The search
allocates a band of at most W nodes in each level of the search tree for storing active nodes in that level, D priority lists, one for each level, for
storing overflow nodes, and D counters, one for each level, for keeping
track of and to limit the degree of backtracking allowed. The algorithm
has three major features: a) it selects for expansion in a best-first
fashion from all nodes in the bands, b) it moves nodes from an overflow
list into the corresponding band in a depth-first fashion, and c) it
restricts backtracking so that at most W nodes in a level are fully searched
before allowing new nodes in this level into the band. The algorithm
specializes to be a guided depth-first search (GDFS) when W is one, and a
best-first search (BFS) when W is unlimited. By choosing W appropriately,
we show empirically that the algorithm can often outperform GDFS and has
performance close to that of BFS. Moreover, the algorithm can be used
instead of GDFS in iterative searches, such as IDA*, MIDA*, and DFS*. Finally,
we identify conditions upon which the algorithm behaves like BFS, and
the anomalous behavior when the algorithm performs worse than GDFS.
This paper presents a method for tuning parameters under a fixed time
constraint for a general binocular stereo-vision algorithm. A major difficulty
in stereo vision, as well as in other vision algorithms, lies in
adjusting the large variety of parameters for maximizing performance. This
effort is usually performed by human experts with a minimum of formal
guidelines. To automate this process, we develop Teacher 4.2, a generate-and-test
system that systematically generates new parameter values by analyzing
the results of previous tests, and that performs limited and controlled
tests on the candidates generated using high-speed computers. The system
is modeled as a statistical selection problem operating under a given time
constraint. It divides the time allowed into stages, where promising
parameter-value sets found in one stage are passed to the next stage for
further testing, and selects the parameter-value set deemed best by the
final stage as the result. We show experimentally that our system can find
new parameter-value sets which in some cases are better than the ones
originally found by extensive hand-tuning and commonly used heuristics. Our
experiments further show that different parameter values may be required
under different objectives and performance constraints. Our system
provides an automated tool for generating new parameters that can be part of
an adaptive stereo-vision system, capable of adapting to changing algorithm
specifications as well as changing goals and target domains.
We present in this paper systolic arrays with constant number of
input/output (I/O) ports for two-dimensional (2-D) FIR and IIR filtering.
Our design has an array of LxN processing elements (PE's), where L
(less than or equal to N) is
a technology-dependent parameter related to the number of I/O ports. Each
PE in our design has a microprogrammed arithmetic logic unit (ALU), a
control unit, a fixed number of I/O buffers, and O(N/L) memory. Our design
specializes to a square mesh when L=N, and a linear array when L=1. It can
implement both FIR and IIR filtering in O(N2M/L) time which is
asymptotically optimal.
Load-balancing systems use workload indices to dynamically schedule jobs.
We present a novel method of automatically learning such indices. Our
approach uses comparator neural networks, one per site, which learn to
predict the relative speedup of an incoming job using only the
resource-utilization patterns observed prior to the job's arrival. Our load indices
combine information from the key resources of contention: CPU, disk,
network, and memory. Our learning algorithm overcomes the lack of job-specific information by learning to compare the relative speedups of different sites with respect to the same job, rather than attempting to
predict absolute speedups. We present conditions under which such learning
is viable. Our results show that indices learnt using comparator networks
correctly pick the best destination in most cases when incoming jobs are
short; accuracy degrades as execution time increases.
In designing application-specific bit-level architectures and in
programming existing bit-level processor arrays, it is necessary to expand a
word-level algorithm into its bit-level form before dependence analysis can
be performed. In this paper, we consider dependence structures of
bit-level algorithms as functions of three components -- dependence structures
of word-level algorithms, dependence structures of the arithmetic
algorithms implementing word-wise operations, and algorithm expansions. Based
on these components, we can derive dependence structures of bit-level
algorithms without using time consuming general dependence analysis methods.
To illustrate our approach, we derive two dependence structures for
bit-level matrix multiplication and apply a method developed earlier to design
two bit-level architectures. One of these architectures is O(p) times
faster than the best word-level architecture, where p is the word length. The
speedup we found here is true in general because a bit in a bit-level
architecture goes to the next processor for processing as soon as it is
available.
In this paper, we present efficient algorithms for adjusting configuration
parameters of genetic algorithms that operate in a noisy environment.
Assuming that the population size is given, we address two problems
specifically important in a noisy environment. First, we study the duration-sizing
problem that determines dynamically the duration of each generation.
Next, we study the sample-allocation (sizing) problem that determines
adaptively the number of evaluations taken from each population in a
generation. For these two problems, we model the search process as a statistical
selection process and derive equations useful for controlling the duration
and the sample sizes. Our result shows that these adaptive procedures
improve the performance of genetic algorithms over those of commonly used
static ones.
The scheduling of tasks for applications with dynamic behavior
traditionally rely on externally observable metrics such as the number of active
processes. This paper presents a new scheduling strategy based on the
observation that it may be possible to capture the near-term resource
requirements of active tasks by expressions involving task parameters. Run-time evaluation of these expressions yields estimates of task behavior that
are valid over a short, future interval of time. The heuristics proposed,
which when used in conjunction with information supplied by profiling, can
be used to annotate the source program with such expressions. Preliminary
simulation results show that the use of near-future estimates in a dynamic
scheduling strategy for divide-and-conquer algorithms consistently improves
over traditional dynamic strategies. The performance of this strategy
approaches that of the best-known deterministic strategy while incurring an
overhead of the same order as other dynamic strategies.
In this paper, we propose a new supervised learning method to improve the
learning speed for feed-forward neural networks. Our method is different
from traditional supervised learning methods (such as back-propagation)
that find a one-to-one mapping between the given input pattern matrix and
the desired output pattern matrix. Instead, it finds one of the
one-to-many mappings between the input matrix and an intermediate output matrix,
and transforms the intermediate output matrix to the desired output matrix
in one step using linear mapping techniques. Learning is faster with our
method because there exist many intermediate output matrices, and learning
can stop whenever one such matrix is found. Our extensive experimental
results show that our learning algorithm converges to within a reasonable
range of error after a few training epochs, making it suitable for dynamic
real-time applications in which the network may need to be re-trained
periodically.
Effective load-balancing policies use dynamic resource information to
schedule tasks in a distributed computer system. In this paper, we present
a novel method for automatically learning such policies. At each site in
our system, we use a comparator neural network to predict the relative
speedup of an incoming task using only the resource-utilization patterns
obtained prior to the task's arrival. Outputs of these comparator networks
are broadcast periodically over the distributed system, and the resource
schedulers at each site use these values to determine the best site for
executing an incoming task. The delays incurred in propagating workload
information and tasks from one site to another, as well as the dynamic and
unpredictable nature of workloads in multiprogrammed multiprocessors, may
cause the workload pattern at the time of execution to differ from patterns
prevailing at the times of load-index computation and decision making. Our
load-balancing policy accommodates this uncertainty by using certain
tunable parameters. We present a population-based machine-learning algorithm
that adjusts these parameters in order to achieve high average speedups
with respect to local execution. Our learning algorithm overcomes the lack
of task-specific information by learning to compare the relative speedups
of different sites with respect to the same site, rather than attempting to
predict absolute speedups. We present experimental results based on the
execution of benchmark programs on a network of Sun workstations connected
by a local area network. Our results show that our load-balancing policy,
when combined with the comparator neural network for workload
characterization, is effective in exploiting idle resources in a distributed computer
system.
In this paper, we present the design of an application-specific coprocessor
for algorithms that can be modeled as uniform recurrences or
``uniformized'' affine recurrences. The coprocessor has a regular array of processors
connected to an access-unit for intermediate storage of data. The
distinguishing feature of our approach is that we assume the coprocessor to be
interfaced to a standard, slow (single-ported) memory with low bandwidth.
Hence, good performance is achieved by effectively exploiting data locality
in the applications by the compiler, and the final architecture is chosen
by a tradeoff analysis driven by the mapping process. Preliminary results
indicate that the coprocessor has significantly lower clock rates or higher
performance than that of existing RISC processors and is cost-effective for
executing loop computations.
This paper presents the position statements of the panelists in a panel
discussion at the 13th International Joint Conference on Artificial
Intelligence.
In this paper, we present a new learning mechanism called mixed-mode
learning. This learning mechanism transforms an existing supervising learning
algorithm from one that finds a unique one-to-one mapping into an algorithm
that finds one of the many one-to-many mappings. Since the objective of
learning is relaxed by this transformation, the number of learning epochs
can be reduced significantly. We show in this paper that mixed-mode
learning can be applied in the well-known cascade correlation learning algorithm
to reduce the number of hidden units required for convergence when applied
to classification problems whose desired outputs are binary. Our
experimental results confirm this reduction, although they show that more learning
time is required than that of cascade correlation. In general, reducing
the number of hidden units at the expense of additional learning time
is usually desirable.
In this paper we present an improved search procedure for the General
Parameter Method (GPM) [GanWah92a, GanWah92b]. Our procedure maps uniform
dependence algorithms to application-specific processor arrays (PAs). It
can optimize general design objectives with certain non-monotonicity
properties, i.e., those that do not increase monotonically with the
parameters. An example of such an objective is the minimization of the total
completion time, including load and drain times. In contrast, earlier
design methods can only deal with monotonic objectives. We present results
for the matrix-product problem using this search technique; application of
the technique on the transitive-closure problem is presented elsewhere
[ForWahShaGan93]. We also show that the parameters in GPM can be expressed
in terms of the schedule vector Pi and allocation matrix S in the popular
dependence-based method (DM), thereby allowing GPM to be used in DM for
finding optimal designs for uniform dependence algorithms.
In this paper, we present a method for finding the minimal configuration of
a feed-forward artificial neural network (ANN) for solving a given
application problem. We assume that the cascade-correlation (CAS) training algorithm
is used to train the weights of the ANNs concerned. Under a given
time constraint that is long enough to train tens of ANNs completely, we
divide the time into quanta, and present a method for scheduling
dynamically the ANN to be trained in each quantum from a pool of partially trained
ANNs. Our goal is to find an ANN configuration with smaller number of
hidden units as compared to the alternative of applying the CAS algorithm repeatedly to train each ANN to completion before exploring new ANNs. Our
system is a population-based generate-and-test method that maintains a
population of candidate ANNs, and that selectively train those that are
predicted to require smaller configurations. Since it is difficult to
predict the exact number of hidden units required when the CAS algorithm
terminates, our system compares two partially trained ANNs and predicts
which one will converge with a smaller number of hidden units relative to
the other. Our prediction mechanism is based on a comparator neural
network (CANN) that takes as inputs the TSSE-versus-time behavior of training
performed already on two ANNs, and that predicts which one will require a
smaller number of hidden units when convergence is reached. We show that
our CANN can predict correctly most of the time, and present experimental
results on better configurations found in a given time limit for a
classification problem and the two-spiral problem.
In this paper, we discuss twelve application-specific issues in strategy
learning. These issues are important when designing strategy learning
systems as well as in learning new application-specific strategies. In
each issue we discuss the problems involved and identified some effective
solutions.
Finding the global optimal solution for a general nonlinear program is a
difficult task except for very small problems. In this paper we identify a
class of nonlinear programming problems called polynomial programming
problems (PP). A polynomial program is an optimization problem with a scalar
polynomial objective function and a set of polynomial constraints. By using
Groebner Bases, we can determine the global minimum of a polynomial program
in a reasonable amount of time and memory.
In this paper, we study the performance of various IDA*-style searches and
investigate methods to improve their performance by predicting in each
stage the threshold to use for pruning. We first present three models to
approximate the distribution of number of search nodes by lower bounds:
exponential, geometric, and linear, and illustrate these distributions based
on some well-known combinatorial search problems. We then show the
performance of an ideal IDA* algorithm and identify reasons why existing IDA*-style algorithms perform well. In practice, we will be able to know from
previous experience the distribution for a given problem instance but will
not be able to determine the parameters of the distribution. Hence, we
develop RIDA*, a method that estimates dynamically the parameters of the
distribution, and predicts the best threshold to use. Finally, we compare
the performance of several IDA*-style algorithms --- Korf's IDA*, RIDA*,
IDA*-CR and DFS* --- on several application problems, and identify
conditions under which each of these algorithms will perform well.
In this paper we present the design of a comparator neural network (CANN)
and its associated learning algorithms. This network is useful for
comparing two time series in order to predict which one will have a better future
value based on a user-specified objective function. We illustrate its
application in distributed load balancing and in an automated design system
for neural networks. In the first application, we use a CANN as a workload
predictor that takes as input a time window of measurable values of CPU
occupancy, network traffic, disk traffic, and memory occupancy from two
computers, and predicts which one of the two computers will have a shorter
turnaround time for executing an incoming job. In the second application,
we use a CANN to predict which one of two partially trained neural networks
will have a shorter convergence time, given current TSSE traces of these
networks. In each application, experimental results show that prediction
is improved using the proposed CANNs.
We consider a class of zero-one integer programming feasibility problems
(0-1 ILPF problems) in which the coefficients of variables can be integers,
and the objective is to find an assignment of binary variables so that all
constraints are satisfied. We propose a Lagrangian formulation in the
continuous space and develop a gradient search in this space. By using two
counteracting forces, one performing gradient search in the primal space
(of the original variables) and the other in the dual space (of the
Lagrangian variables), we show that our search algorithm does not get
trapped in local minima and reaches equilibrium only when a feasible
assignment to the original problem is found. We present experimental
results comparing our method with backtracking and local search (based on
random restarts). Our results show that 0-1 ILPF problems of reasonable
sizes can be solved by an order of magnitude faster than existing methods.
In this paper, we present new results on the automated generalization of
performance-related heuristics learned for knowledge-lean applications. We
study methods to statistically generalize new heuristics learned for some
small subsets of a problem space (using methods such as genetics-based
learning) to unlearned problem subdomains. Our method uses a new
statistical metric called probability of win. By assessing the performance
of heuristics in a range-independent and distribution-independent manner,
we can compare heuristics across problem subdomains in a consistent manner.
To illustrate our approach, we show experimental results on generalizing
heuristics learned for sequential circuit testing, VLSI cell placement and
routing, and branch-and-bound search. We show that generalization can lead
to new and robust heuristics that perform better than the original
heuristics across problem instances of different characteristics.
In this paper, we discuss a new approach to generalize heuristic methods
(HMs) to new test cases of an application, and conditions under which such
generalization is possible. Generalization is difficult when performance
values of HMs are characterized by multiple statistical distributions
across subsets of test cases of an application. We define a new measure
called probability of win and propose three methods to evaluate it:
interval analysis, maximum likelihood estimate, and Bayesian analysis. We show
experimental results on new HMs found for blind equalization and branch-and-bound search.
In this paper, we present a new supervised learning method called Novel
(Nonlinear Optimization Via External Lead) for training feed-forward neural
networks. In general, such learning can be considered as a nonlinear global optimization problem in which the goal is to minimize the nonlinear error function that spans the space of weights. Novel is a trajectory-based
nonlinear optimization method that combines global and local searches to
find good local minima. It relies on an external force to pull a search
out of a local minimum in its global search and employs local descents to
locate local minima in its local search. The result is an efficient search
method that identifies good basins without spending a lot of time in them.
We have shown improved training and testing results for five neural-network
benchmark problems as compared to those of existing minimization and
neural-network learning algorithms. For the two-spiral problem, Novel has
found a design using only four hidden units and 25 weights. (The best
known design requires nine hidden units and 75 weights.) In short, Novel
represents a significant advance in the state-of-the-art in supervised
learning of neural networks and general optimization of continuous non-
linear functions.
Satisfiability is a class of NP-complete problems that model a wide range
of real-world applications. These problems are difficult to solve because
they have many local minima in their search space, often trapping greedy
search methods that utilize some form of descent. Further, gradient
changes in the search space are in integers, leading to possibly many
directions of the same gradient. In this paper, we propose a new discrete
Lagrange-multiplier-based global-search method for solving satisfiability
problems. We derive new approaches for applying Lagrangian methods in
discrete space, show that equilibrium is reached when a feasible assignment
to the original problem is found, and present heuristic algorithms to look
for equilibrium points. Instead of restarting from a new starting point
when a search reaches a local trap, the Lagrangian variables in our method
provide a force to lead the search out of a local minimum and move it in
the direction provided by the Lagrangian variables. One of the major
advantages of our method is that it has very few algorithmic parameters to be
tuned by users, and the search procedure can be made deterministic and the
results, reproducible. We demonstrate our method by applying it to solve
an extensive set of benchmark problems archived in DIMACS of Rutgers
University. Our method generally performs better than the best existing
methods and can achieve an order-of-magnitude speedup for some problems.
Moreover, our method can solve some new benchmark problems that cannot be
solved by other local-search methods.
Uniform recurrent algorithms belong to a fundamental class of operations
used in almost every scientific and engineering computation. An important
example of these algorithms is matrix multiplication. In this paper, we
present optimal designs of bit-level processor arrays for matrix
multiplication, and analyze trade-offs between bit-level and word-level designs in
terms of execution time and chip area. We formulate the design as an
optimization problem and search for optimal designs using the General Parameter Method developed by Ganapathy and Wah. We show that previous bit-level
processor arrays for matrix multiplication are two special cases of our
more general designs, and prove that one of the previous designs is optimal
in execution time. Finally, we show that the fastest bit-level design is
only O(log p) faster than than the fastest word-level design, instead of
O(p) as claimed in the previous work of Shang and Wah, where p is the
number of bits per word.
The satisfiability (SAT) problem is a core problem in mathematical logic
and computing theory. In practice, SAT is fundamental in solving many
problems in automated reasoning, computer-aided design, computer-aided
manufacturing, machine vision, database, robotics, integrated circuit
design, computer architecture design, and computer network design.
Traditional methods treat SAT as a discrete, constrained decision problem.
In recent years, many optimization methods, parallel algorithms, and
practical techniques have been developed for solving SAT. In this survey,
we present a general framework (an algorithm space) that integrates
existing SAT algorithms into a unified perspective. We describe sequential
and parallel SAT algorithms including variable splitting, resolution, local
search, global optimization, mathematical programming, and practical SAT
algorithms. We give performance evaluation of some existing SAT
algorithms. Finally, we provide a set of practical applications of the
satisfiability problems.
In this paper, we present a new global optimization method for designing
QMF (quadrature mirror filters) filter banks. We formulate the design
problem as a nonlinear constrained optimization problem, using the
reconstruction error as the objective, and the stopband ripple, stopband
energy, passband ripple, passband energy and transition bandwidth as
constraints. This formulation allows us to search for solutions that
improves with respect to the objective, and that performs better than or
equal to the best existing designs with respect to the constrained
measures. We present Novel, a global optimization method we have developed
for solving nonlinear continuous constrained optimization problems, and
apply it to find improved designs. We also show that relaxing the
constraints on transition bandwidth and stopband energy will lead to
significant improvements in the other performance measures.
In this paper, we present a new method to handle inequality constraints and
apply it in Novel (Nonlinear Optimization via External Lead), a system we
have developed for solving constrained continuous nonlinear optimization
problems. In general, in applying Lagrange-multiplier methods to solve
these problems, inequality constraints are first converted into equivalent
equality constraints. One such conversion method adds a slack variable to
each inequality constraint in order to convert it into an equality
constraint. The disadvantage of this conversion is that when the search is
inside a feasible region, some satisfied constraints may still pose a nonzero weight in the Lagrangian function, leading to possible oscillations
and divergence when a local optimum lies on the boundary of a feasible
region. We propose a new conversion method called the MaxQ method such
that all satisfied constraints in a feasible region always carry zero
weight in the Lagrange function; hence, minimizing the Lagrange function in
a feasible region always leads to local minima of the objective function.
We demonstrate that oscillations do not happen in our method. We also
propose methods to speed up convergence when a local optimum lies on the
boundary of a feasible region. Finally, we show improved experimental
results in applying our proposed method in Novel on some existing benchmark
problems and compare them to those obtained by applying the method based on
slack variables.
In this paper, we study various global optimization methods for designing
QMF (quadrature mirror filter) filter banks. We formulate the design
problem as a nonlinear constrained optimization problem, using the
reconstruction error as the objective, and other performance metrics as
constraints. This formulation allows us to search for designs that improve
over the best existing designs. We present Novel, a global optimization
method for solving nonlinear continuous constrained optimization problems.
We show that Novel finds better designs with respect to simulated annealing
and genetic algorithms in solving QMF benchmark design problems. We also
show that relaxing the constraints on transition bandwidth and stopband
energy leads to significant improvements in the other performance measures.
Weighted maximum satisfiability problems (MAX-SAT) are difficult to solve
due to the large number of local minima in their search space. In this
paper we propose a new discrete Lagrangian based search method (DLM) for
solving these problems. Instead of restarting from a new point when the
search reaches a local minimum, the Lagrange multipliers in DLM provide a
force to lead the search out of the local minimum and move it in a
direction provided by the multipliers. Since DLM has very few parameters
to be tuned, it can be made deterministic and the results, reproducible.
We compare DLM with GRASP in solving a large set of test problems, and show
that it finds better solutions and is substantially faster. DLM has a
solid theoretical foundation that can be used as a systematic approach for
solving general discrete optimization problems.