A touch of functional style in plain Java with predicates: Part 2

In the first part of this article we introduced predicates, which bring some of the benefits of functional programming to object-oriented languages such as Java through a simple interface with a single method that returns true or false. In this second and last part, we’ll cover some more advanced notions to get the most out of your predicates.

Testing

One obvious case where predicates shine is testing. Whenever you need to test a method that mixes walking a data structure with some conditional logic, predicates let you test each half in isolation: first the walk over the data structure, then the conditional logic.

In a first step, you simply pass the always-true or the always-false predicate to the method, to get rid of the conditional logic and focus only on walking the data structure correctly:

// check with the always-true predicate
final Iterable<PurchaseOrder> all = orders.selectOrders(Predicates.<PurchaseOrder> alwaysTrue());
assertEquals(2, Iterables.size(all));

// check with the always-false predicate
assertTrue(Iterables.isEmpty(orders.selectOrders(Predicates.<PurchaseOrder> alwaysFalse())));

In a second step, you test each possible predicate separately:

final CustomerPredicate isForCustomer1 = new CustomerPredicate(CUSTOMER_1);
assertTrue(isForCustomer1.apply(ORDER_1)); // ORDER_1 is for CUSTOMER_1
assertFalse(isForCustomer1.apply(ORDER_2)); // ORDER_2 is for CUSTOMER_2

This example is simple, but you get the idea. To test more complex logic, if testing each half of the feature is not enough, you can create fake predicates, for example a predicate that returns true once and then always false. Predicates like these can greatly simplify your test setup, thanks to the strict separation of concerns.
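As a minimal sketch of such a fake predicate, the hypothetical TrueOncePredicate below accepts the first element it is given and rejects every later one. The small Predicate interface and the select helper are stand-ins (assumptions) for Guava's Predicate and for a method like selectOrders, so that the example compiles on its own.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Minimal stand-in for Guava's Predicate (assumption), so the sketch compiles alone.
interface Predicate<T> {
    boolean apply(T input);
}

// Hypothetical test double: true for the first element, false for all later ones.
final class TrueOncePredicate<T> implements Predicate<T> {
    private boolean alreadyUsed = false;

    @Override
    public boolean apply(T input) {
        if (alreadyUsed) {
            return false;
        }
        alreadyUsed = true;
        return true;
    }
}

final class TrueOncePredicateDemo {
    // Simplified stand-in for a method under test such as selectOrders(...).
    static <T> List<T> select(List<T> items, Predicate<? super T> condition) {
        List<T> result = new ArrayList<>();
        for (T item : items) {
            if (condition.apply(item)) {
                result.add(item);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> orders = Arrays.asList("ORDER_1", "ORDER_2", "ORDER_3");
        // Only the first element passes, whatever the elements are.
        System.out.println(select(orders, new TrueOncePredicate<String>())); // prints [ORDER_1]
    }
}
```

Because the fake ignores its argument, the test set-up needs no specially prepared orders; a fresh instance is required for each test since it carries state.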

Predicates work so well for testing that if you tend to do some TDD, that is, if the way you can test influences the way you design, then as soon as you know predicates they will surely find their way into your design.

Explaining to the team

In the projects I’ve worked on, the team was not familiar with predicates at first. However, the concept is simple and fun enough for everyone to pick it up quickly. In fact, I was surprised by how naturally the idea of predicates spread from the code I had written to the code of my colleagues, without much evangelism from me. I guess the benefits of predicates speak for themselves. Having mature APIs from big names such as Apache or Google also helps convince people that this is serious stuff. And now, with the hype around functional programming, it is even easier to sell!

Simple optimizations

This engine is so big that no optimization is required (Chicago Auto Show).

The usual optimizations are to make predicates immutable and stateless as much as possible, so that they can be shared without any threading concerns. This makes it possible to use one single instance for the whole process (as a singleton, e.g. as static final constants). The most frequently used predicates that cannot be enumerated at compile time can be cached at runtime if needed. As usual, do this only if your profiler report really calls for it.
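Both ideas can be sketched as follows, under a few assumptions: the Predicate interface is a stand-in for Guava's, PurchaseOrder is reduced to two fields, and the cache is a plain unbounded ConcurrentHashMap, which is only reasonable when the number of distinct customers is small.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Minimal stand-in for Guava's Predicate (assumption), so the sketch compiles alone.
interface Predicate<T> {
    boolean apply(T input);
}

// Hypothetical, simplified order: only the fields the predicates need.
final class PurchaseOrder {
    final String customer;
    final boolean pending;

    PurchaseOrder(String customer, boolean pending) {
        this.customer = customer;
        this.pending = pending;
    }
}

// Stateless and immutable: a single shared instance is safe for every thread.
final class PendingOrderPredicate implements Predicate<PurchaseOrder> {
    static final PendingOrderPredicate INSTANCE = new PendingOrderPredicate();

    private PendingOrderPredicate() {
    }

    @Override
    public boolean apply(PurchaseOrder order) {
        return order.pending;
    }
}

// Immutable but parameterized: instances cannot be listed at compile time,
// so they are cached at runtime, one instance per customer.
final class CustomerPredicate implements Predicate<PurchaseOrder> {
    private static final Map<String, CustomerPredicate> CACHE = new ConcurrentHashMap<>();

    private final String customer;

    private CustomerPredicate(String customer) {
        this.customer = customer;
    }

    static CustomerPredicate of(String customer) {
        return CACHE.computeIfAbsent(customer, CustomerPredicate::new);
    }

    @Override
    public boolean apply(PurchaseOrder order) {
        return customer.equals(order.customer);
    }
}

final class SharedPredicatesDemo {
    public static void main(String[] args) {
        PurchaseOrder order = new PurchaseOrder("alice", true);
        System.out.println(PendingOrderPredicate.INSTANCE.apply(order)); // prints true
        // The same cached instance is returned for the same customer.
        System.out.println(CustomerPredicate.of("alice") == CustomerPredicate.of("alice")); // prints true
    }
}
```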

When possible, a predicate object can pre-compute some of the calculations involved in its evaluation, either in its constructor (which is naturally thread-safe) or lazily.

A predicate is expected to be free of side effects, in other words « read-only »: its execution should not cause any observable change to the state of the system. Some predicates do need some internal state, such as a counter-based predicate used for paging, but they still must not change any state in the system they are applied to. With internal state they cannot be shared either; however, they can be reused within their own thread if they support a reset between successive uses.
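A counter-based paging predicate could look like the sketch below. PagePredicate, its constructor parameters and the reset() method are hypothetical names chosen for illustration, and the Predicate interface is again a stand-in for Guava's.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Minimal stand-in for Guava's Predicate (assumption), so the sketch compiles alone.
interface Predicate<T> {
    boolean apply(T input);
}

// Hypothetical paging predicate: accepts only the elements whose position falls
// inside the requested page. It is not thread-safe and must not be shared.
final class PagePredicate<T> implements Predicate<T> {
    private final int first; // position of the first accepted element (inclusive)
    private final int last;  // position after the last accepted element (exclusive)
    private int seen = 0;    // internal state only; filtered elements are never modified

    PagePredicate(int pageIndex, int pageSize) {
        this.first = pageIndex * pageSize;
        this.last = first + pageSize;
    }

    @Override
    public boolean apply(T input) {
        int position = seen++;
        return position >= first && position < last;
    }

    // Call between two successive uses within the same thread.
    void reset() {
        seen = 0;
    }
}

final class PagePredicateDemo {
    static <T> List<T> select(List<T> items, Predicate<? super T> condition) {
        List<T> result = new ArrayList<>();
        for (T item : items) {
            if (condition.apply(item)) {
                result.add(item);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> items = Arrays.asList("a", "b", "c", "d", "e");
        PagePredicate<String> secondPage = new PagePredicate<>(1, 2);
        System.out.println(select(items, secondPage)); // prints [c, d]
        secondPage.reset();                            // required before reuse
        System.out.println(select(items, secondPage)); // prints [c, d]
    }
}
```

Note that such a predicate depends on the order in which elements are presented, so it only makes sense with a filtering method that walks the structure in a stable order.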

Fine-grained interfaces: a larger audience for your predicates

In large applications you find yourself writing very similar predicates for totally different types that share a common property, such as being related to a Customer. For example, in the administration page you may want to filter logs by customer; in the CRM page you may want to filter complaints by customer.

For each such type X you’d need yet another CustomerXPredicate to filter it by customer. But since each X is related to a customer in some way, we can factor that out (Extract Interface in Eclipse) into an interface CustomerSpecific with one method:

public interface CustomerSpecific {
    Customer getCustomer();
}

This fine-grained interface reminds me of traits in some languages, except that it has no reusable implementation. It could also be seen as a way to introduce a touch of dynamic typing within statically typed languages, as it lets you treat any object with a getCustomer() method in the same way. Of course our class PurchaseOrder now implements this interface.

Once we have this interface CustomerSpecific, we can define predicates on it rather than on each particular type as we did before. This lets a large project get by with just a few predicates. In this case, the predicate CustomerPredicate is co-located with the interface CustomerSpecific it operates on, and its type parameter is CustomerSpecific:

public final class CustomerPredicate implements Predicate<CustomerSpecific>, CustomerSpecific {
    private final Customer customer;

    public CustomerPredicate(Customer customer) {
        this.customer = customer;
    }

    public Customer getCustomer() {
        return customer;
    }

    // true when the given object relates to the same customer as this predicate
    public boolean apply(CustomerSpecific specific) {
        return specific.getCustomer().equals(customer);
    }
}

Notice that the predicate can itself implement the interface CustomerSpecific, hence could even evaluate itself!

When using trait-like interfaces like this one, you must take care of the generics and slightly change the method that expects a Predicate<PurchaseOrder> in the class PurchaseOrders, so that it also accepts any predicate on a supertype of PurchaseOrder:

public Iterable<PurchaseOrder> selectOrders(Predicate<? super PurchaseOrder> condition) {
    return Iterables.filter(orders, condition);
}
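To see what the wildcard buys, here is a small self-contained sketch in which one CustomerPredicate filters two unrelated types. Several things are assumptions made for brevity: the Predicate interface stands in for Guava's, the customer is reduced to a String, Complaint is a hypothetical second type, and Selection.select replaces Iterables.filter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Minimal stand-in for Guava's Predicate (assumption), so the sketch compiles alone.
interface Predicate<T> {
    boolean apply(T input);
}

// The customer is reduced to a String to keep the sketch short (assumption).
interface CustomerSpecific {
    String getCustomer();
}

final class PurchaseOrder implements CustomerSpecific {
    private final String customer;

    PurchaseOrder(String customer) {
        this.customer = customer;
    }

    @Override
    public String getCustomer() {
        return customer;
    }
}

// Hypothetical second type, e.g. for the CRM page.
final class Complaint implements CustomerSpecific {
    private final String customer;

    Complaint(String customer) {
        this.customer = customer;
    }

    @Override
    public String getCustomer() {
        return customer;
    }
}

final class CustomerPredicate implements Predicate<CustomerSpecific> {
    private final String customer;

    CustomerPredicate(String customer) {
        this.customer = customer;
    }

    @Override
    public boolean apply(CustomerSpecific specific) {
        return specific.getCustomer().equals(customer);
    }
}

final class Selection {
    // "? super T" lets a Predicate<CustomerSpecific> filter a list of PurchaseOrder.
    static <T> List<T> select(Iterable<T> items, Predicate<? super T> condition) {
        List<T> result = new ArrayList<>();
        for (T item : items) {
            if (condition.apply(item)) {
                result.add(item);
            }
        }
        return result;
    }
}

final class CustomerSpecificDemo {
    public static void main(String[] args) {
        CustomerPredicate isForAlice = new CustomerPredicate("alice");
        List<PurchaseOrder> orders = Arrays.asList(new PurchaseOrder("alice"), new PurchaseOrder("bob"));
        List<Complaint> complaints = Arrays.asList(new Complaint("bob"), new Complaint("alice"));
        // The same predicate instance works on both types.
        System.out.println(Selection.select(orders, isForAlice).size());     // prints 1
        System.out.println(Selection.select(complaints, isForAlice).size()); // prints 1
    }
}
```

With a parameter typed Predicate<T> instead of Predicate<? super T>, neither call in main would compile.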

Specification in Domain-Driven Design

Eric Evans and Martin Fowler co-wrote the Specification pattern, which is clearly a predicate. Actually « predicate » is the word used in logic programming, and the Specification pattern was written to explain how we can borrow some of the power of logic programming in our object-oriented languages.

In the book Domain-Driven Design, Eric Evans details this pattern and gives several examples of Specifications which all express parts of the domain. Just like this book describes a Policy pattern that is nothing but the Strategy pattern when applied to the domain, in some sense the Specification pattern may be considered a version of predicate dedicated to the domain aspects, with the additional intent to clearly mark and identify the business rules.

As a remark, the method name suggested in the Specification pattern is: isSatisfiedBy(T): boolean, which emphasises a focus on the domain constraints. As we’ve seen before with predicates, atoms of business logic encapsulated into Specification objects can be recombined using boolean logic (or, and, not, any, all), as in the Interpreter pattern.
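A minimal sketch of this recombination is shown below. Only the method name isSatisfiedBy comes from the pattern; the and/or/not combinators written as Java 8 default methods, the simplified PurchaseOrder and the two specification classes are assumptions made for illustration, not the book's implementation.

```java
// Hypothetical Specification interface; only isSatisfiedBy is named by the pattern.
interface Specification<T> {
    boolean isSatisfiedBy(T candidate);

    // Combinators written as Java 8 default methods (an assumption of this sketch).
    default Specification<T> and(Specification<? super T> other) {
        return candidate -> isSatisfiedBy(candidate) && other.isSatisfiedBy(candidate);
    }

    default Specification<T> or(Specification<? super T> other) {
        return candidate -> isSatisfiedBy(candidate) || other.isSatisfiedBy(candidate);
    }

    default Specification<T> not() {
        return candidate -> !isSatisfiedBy(candidate);
    }
}

// Hypothetical, simplified order.
final class PurchaseOrder {
    final boolean pending;
    final int amount;

    PurchaseOrder(boolean pending, int amount) {
        this.pending = pending;
        this.amount = amount;
    }
}

// Each class is one named atom of business logic.
final class PendingOrderSpecification implements Specification<PurchaseOrder> {
    @Override
    public boolean isSatisfiedBy(PurchaseOrder order) {
        return order.pending;
    }
}

final class LargeOrderSpecification implements Specification<PurchaseOrder> {
    private final int threshold;

    LargeOrderSpecification(int threshold) {
        this.threshold = threshold;
    }

    @Override
    public boolean isSatisfiedBy(PurchaseOrder order) {
        return order.amount >= threshold;
    }
}

final class SpecificationDemo {
    public static void main(String[] args) {
        Specification<PurchaseOrder> pendingAndLarge =
                new PendingOrderSpecification().and(new LargeOrderSpecification(1000));
        System.out.println(pendingAndLarge.isSatisfiedBy(new PurchaseOrder(true, 5000)));  // prints true
        System.out.println(pendingAndLarge.isSatisfiedBy(new PurchaseOrder(false, 5000))); // prints false
    }
}
```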

The book also describes some more advanced techniques such as optimization when querying a database or a repository, and subsumption.

Optimisations when querying

The following are optimization tricks, and I’m not sure you will ever need them. But it is true that predicates are quite dumb when it comes to filtering datasets: they must be evaluated on every single element of a set, which may cause performance problems for huge sets. If the elements are stored in a database, retrieving every one of them just to pass them one after another through a predicate does not sound like the right idea for large sets…

When you hit performance issues, you start the profiler and find the bottlenecks. Now if calling a predicate very often to filter elements out of a data structure is a bottleneck, then how do you fix that?

One way is to get rid of the full predicate thing, and to go back to hard-coded, more error-prone, repetitive and less-testable code. I always resist this approach as long as I can find better alternatives to optimize the predicates, and there are many.

First, have a deeper look at how the code is being used. In the spirit of Domain-Driven Design, looking at the domain for insights should be systematic whenever a question occurs.

Very often there are clear patterns of use in a system. Though statistical, they offer great opportunities for optimisation. For example in our PurchaseOrders class, retrieving every PENDING order may be used much more frequently than every other case, because that’s how it makes sense from a business perspective, in our imaginary example.

Friend Complicity

Weird complicity (Maeght foundation)

Based on the usage pattern you may code alternate implementations that are specifically optimised for it. In our example of pending orders being frequently queried, we would code an alternate implementation FastPurchaseOrder, that makes use of some pre-computed data structure to keep the pending orders ready for quick access.

Now, in order to benefit from this alternate implementation, you may be tempted to change its interface to add a dedicated method, e.g. selectPendingOrders(). Remember that before you only had a generic selectOrders(Predicate) method. Adding the extra method may be alright in some cases, but may raise several concerns: you must implement this extra method in every other implementation too, and the extra method may be too specific for a particular use-case hence may not fit well on the interface.

A trick for using the internal optimization through the exact same method that only expects predicates is simply to make the implementation recognize the predicate it is related to. I call that « Friend Complicity », in reference to the friend keyword in C++.

/** Optimization method: pre-computed list of pending orders */
private Iterable<PurchaseOrder> selectPendingOrders() {
    // ... optimized stuff...
}

public Iterable<PurchaseOrder> selectOrders(Predicate<? super PurchaseOrder> condition) {
    // internal complicity here: recognize friend class to enable optimization
    if (condition instanceof PendingOrderPredicate) {
        return selectPendingOrders(); // faster way
    }
    // otherwise, back to the usual case
    return Iterables.filter(orders, condition);
}

It’s clear that it increases the coupling between two implementation classes that should otherwise ignore each other. Also it only helps with performance if given the « friend » predicate directly, with no decorator or composite around.

What’s really important with Friend Complicity is to make sure that the behaviour of the method is never compromised: the contract of the interface must be met at all times, with or without the optimisation. The only difference is that the performance improvement may or may not happen. Also keep in mind that you may want to switch back to the unoptimized implementation one day.
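One way to guard that contract is to compare the fast path with the plain filtering path in a test. The sketch below is only an illustration under assumptions: FastPurchaseOrders keeps its pre-computed list up to date in a hypothetical add method, orders are assumed immutable once added, and the Predicate interface stands in for Guava's.

```java
import java.util.ArrayList;
import java.util.List;

// Minimal stand-in for Guava's Predicate (assumption), so the sketch compiles alone.
interface Predicate<T> {
    boolean apply(T input);
}

// Hypothetical, simplified order; assumed immutable once created.
final class PurchaseOrder {
    final String id;
    final boolean pending;

    PurchaseOrder(String id, boolean pending) {
        this.id = id;
        this.pending = pending;
    }
}

final class PendingOrderPredicate implements Predicate<PurchaseOrder> {
    @Override
    public boolean apply(PurchaseOrder order) {
        return order.pending;
    }
}

// Hypothetical optimized implementation with a pre-computed list of pending orders.
final class FastPurchaseOrders {
    private final List<PurchaseOrder> orders = new ArrayList<>();
    private final List<PurchaseOrder> pendingOrders = new ArrayList<>();

    void add(PurchaseOrder order) {
        orders.add(order);
        if (order.pending) {
            pendingOrders.add(order); // keep the pre-computed list in sync
        }
    }

    List<PurchaseOrder> selectOrders(Predicate<? super PurchaseOrder> condition) {
        if (condition instanceof PendingOrderPredicate) {
            return new ArrayList<>(pendingOrders); // fast path for the friend predicate
        }
        return selectUnoptimized(condition);
    }

    // Reference behaviour: the contract the fast path must never break.
    List<PurchaseOrder> selectUnoptimized(Predicate<? super PurchaseOrder> condition) {
        List<PurchaseOrder> result = new ArrayList<>();
        for (PurchaseOrder order : orders) {
            if (condition.apply(order)) {
                result.add(order);
            }
        }
        return result;
    }
}

final class FriendComplicityDemo {
    public static void main(String[] args) {
        FastPurchaseOrders orders = new FastPurchaseOrders();
        orders.add(new PurchaseOrder("A", true));
        orders.add(new PurchaseOrder("B", false));
        orders.add(new PurchaseOrder("C", true));

        PendingOrderPredicate friend = new PendingOrderPredicate();
        // Same result with or without the optimisation.
        System.out.println(orders.selectOrders(friend).equals(orders.selectUnoptimized(friend))); // prints true
    }
}
```

A predicate wrapped in a lambda or decorator is not recognized and simply falls back to the slow path, which still returns the same result.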

SQL-compromised

If the orders are actually stored in a database, then SQL can be used to query them quickly. By the way, you’ve probably noticed that the very concept of a predicate is exactly what you put in the WHERE clause of a SQL query.

Ron Arad designed a chair that encompasses another chair: this is subsumption

A first and simple way to keep using predicates yet improve performance is for some predicates to implement an additional interface SqlAware, with a method asSQL(): String that returns the exact SQL query corresponding to the evaluation of the predicate itself. When the predicate is used against a database-backed repository, the repository calls this method instead of evaluating the predicate on each element through its usual evaluate() or apply() method, and then queries the database with the returned query.

I call that approach SQL-compromised as the predicate is now polluted with database-specific details it should ignore more often than not.
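The idea can be sketched as follows. SqlAware and asSQL() come from the description above; everything else is an assumption for illustration: the table and column names are made up, QueryRunner is a hypothetical abstraction of the database access, and the Predicate interface stands in for Guava's.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Minimal stand-in for Guava's Predicate (assumption), so the sketch compiles alone.
interface Predicate<T> {
    boolean apply(T input);
}

interface SqlAware {
    String asSQL();
}

// Hypothetical, simplified order.
final class PurchaseOrder {
    final String id;
    final String status;

    PurchaseOrder(String id, String status) {
        this.id = id;
        this.status = status;
    }
}

// Table and column names are made up for this illustration.
final class PendingOrderPredicate implements Predicate<PurchaseOrder>, SqlAware {
    @Override
    public boolean apply(PurchaseOrder order) {
        return "PENDING".equals(order.status);
    }

    @Override
    public String asSQL() {
        return "SELECT * FROM purchase_order WHERE status = 'PENDING'";
    }
}

// Hypothetical abstraction of the database access.
interface QueryRunner {
    List<PurchaseOrder> query(String sql);
}

final class OrderRepository {
    private final QueryRunner database;

    OrderRepository(QueryRunner database) {
        this.database = database;
    }

    List<PurchaseOrder> selectOrders(Predicate<? super PurchaseOrder> condition) {
        if (condition instanceof SqlAware) {
            // Let the database do the filtering.
            return database.query(((SqlAware) condition).asSQL());
        }
        // Fallback: load everything, then filter in memory.
        List<PurchaseOrder> result = new ArrayList<>();
        for (PurchaseOrder order : database.query("SELECT * FROM purchase_order")) {
            if (condition.apply(order)) {
                result.add(order);
            }
        }
        return result;
    }
}

final class SqlAwareDemo {
    public static void main(String[] args) {
        PurchaseOrder pending = new PurchaseOrder("A", "PENDING");
        PurchaseOrder shipped = new PurchaseOrder("B", "SHIPPED");
        // Fake database that prints the query it receives.
        QueryRunner fake = sql -> {
            System.out.println("executing: " + sql);
            return sql.contains("WHERE") ? Arrays.asList(pending) : Arrays.asList(pending, shipped);
        };
        OrderRepository repository = new OrderRepository(fake);
        System.out.println(repository.selectOrders(new PendingOrderPredicate()).size()); // prints 1
    }
}
```

The predicate must keep apply() and asSQL() consistent with each other, which is one more reason to treat this approach with care.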

Alternatives to using SQL directly include the use of stored procedures or named queries: the predicate has to provide the name of the query and all its parameters. Double-dispatch between the repository and the predicate passed to it is also an alternative: the repository calls the predicate on its additional method selectElements(this) that itself calls back the right pre-selection method findByState(state): Collection on the repository; the predicate then applies its own filtering on the returned set and returns the final filtered set.
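The double-dispatch variant might be sketched like this. The method names selectElements and findByState follow the paragraph above; the RepositoryAware interface name, the in-memory list standing in for the database, and the simplified PurchaseOrder are assumptions.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Minimal stand-in for Guava's Predicate (assumption), so the sketch compiles alone.
interface Predicate<T> {
    boolean apply(T input);
}

// Hypothetical, simplified order.
final class PurchaseOrder {
    final String customer;
    final String state;

    PurchaseOrder(String customer, String state) {
        this.customer = customer;
        this.state = state;
    }
}

// Hypothetical extra interface for predicates that know how to pre-select.
interface RepositoryAware {
    Collection<PurchaseOrder> selectElements(OrderRepository repository);
}

final class OrderRepository {
    private final List<PurchaseOrder> rows = new ArrayList<>(); // stands in for the database

    void add(PurchaseOrder order) {
        rows.add(order);
    }

    // Pre-selection method; in a real repository this would be an indexed query.
    Collection<PurchaseOrder> findByState(String state) {
        List<PurchaseOrder> result = new ArrayList<>();
        for (PurchaseOrder order : rows) {
            if (state.equals(order.state)) {
                result.add(order);
            }
        }
        return result;
    }

    Collection<PurchaseOrder> selectOrders(Predicate<? super PurchaseOrder> condition) {
        if (condition instanceof RepositoryAware) {
            // First dispatch: hand the repository over to the predicate.
            return ((RepositoryAware) condition).selectElements(this);
        }
        // Fallback: full scan.
        List<PurchaseOrder> result = new ArrayList<>();
        for (PurchaseOrder order : rows) {
            if (condition.apply(order)) {
                result.add(order);
            }
        }
        return result;
    }
}

final class PendingForCustomerPredicate implements Predicate<PurchaseOrder>, RepositoryAware {
    private final String customer;

    PendingForCustomerPredicate(String customer) {
        this.customer = customer;
    }

    @Override
    public boolean apply(PurchaseOrder order) {
        return "PENDING".equals(order.state) && customer.equals(order.customer);
    }

    @Override
    public Collection<PurchaseOrder> selectElements(OrderRepository repository) {
        // Second dispatch: call back the pre-selection that fits this predicate,
        // then apply the remaining filtering on the smaller set.
        List<PurchaseOrder> result = new ArrayList<>();
        for (PurchaseOrder order : repository.findByState("PENDING")) {
            if (apply(order)) {
                result.add(order);
            }
        }
        return result;
    }
}
```

Here the predicate stays free of SQL, at the price of knowing the repository's pre-selection methods.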

Subsumption

Subsumption is a logic concept to express a relation of one concept that encompasses another, such as « red, green, and yellow are subsumed under the term color » (Merriam-Webster). Subsumption between predicates can be a very powerful concept to implement in your code.

Let’s take the example of an application that broadcasts stock quotes. When registering we must declare which quotes we are interested in observing. We can do that by simply passing a predicate on stocks that only evaluates true for the stocks we’re interested in:

public final class StockPredicate implements Predicate<String> {
    private final Set<String> tickers;
    // Constructors omitted for clarity

    public boolean apply(String ticker) {
        return tickers.contains(ticker);
    }
}

Now we assume that the application already broadcasts standard sets of popular tickers on messaging topics, and each topic has its own predicates; if it could detect that the predicate we want to use is « included », or subsumed in one of the standard predicates, we could just subscribe to it and save computation. In our case this subsumption is rather easy, by just adding an additional method on our predicates:

public boolean encompasses(StockPredicate predicate) {
    return tickers.containsAll(predicate.tickers);
}

Subsumption is all about evaluating another predicate for « containment ». This is easy when your predicates are based on sets, as in the example, or when they are based on intervals of numbers or dates. Otherwise you may have to resort to tricks similar to Friend Complicity, i.e. recognizing the other predicate to decide whether it is subsumed or not, on a case-by-case basis.
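For the interval case, a minimal sketch could look like this. RangePredicate is a hypothetical class on a closed interval of integers, and the Predicate interface is again a stand-in for Guava's.

```java
// Minimal stand-in for Guava's Predicate (assumption), so the sketch compiles alone.
interface Predicate<T> {
    boolean apply(T input);
}

// Hypothetical predicate on a closed numeric interval [min, max].
final class RangePredicate implements Predicate<Integer> {
    private final int min;
    private final int max;

    RangePredicate(int min, int max) {
        if (min > max) {
            throw new IllegalArgumentException("min must not exceed max");
        }
        this.min = min;
        this.max = max;
    }

    @Override
    public boolean apply(Integer value) {
        return value >= min && value <= max;
    }

    // True when every value accepted by the other predicate is also accepted by this one.
    boolean encompasses(RangePredicate other) {
        return min <= other.min && other.max <= max;
    }
}

final class RangeSubsumptionDemo {
    public static void main(String[] args) {
        RangePredicate wide = new RangePredicate(0, 100);
        RangePredicate narrow = new RangePredicate(10, 20);
        System.out.println(wide.encompasses(narrow)); // prints true
        System.out.println(narrow.encompasses(wide)); // prints false
    }
}
```

The same comparison of bounds works for dates, using their natural ordering instead of integer comparison.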

Overall, remember that subsumption is hard to implement in the general case, but even partial subsumption can be very valuable, so it is an important tool in your toolbox.

Conclusion

Predicates are fun, and can enhance both your code and the way you think about it!

Cheers,

The single source file for this part is available for download:
cyriux_predicates_par

From http://cyrille.martraire.com/2010/11/a-touch-of-functional-style-in-plain-java-with-predicates-%E2%80%93-part-2/